Title: Social Media Sentiment Analysis

In [44]:
from IPython.display import HTML

# URL of the image
image_url = 'https://www.welovedigitalmarketing.com/wp-content/uploads/2021/08/Tracking-the-Important-Social-Media-Analytics.png'

# Constructing the HTML code to display the image
html_code = f'<img src="{image_url}" width="900">'

# Display the image
HTML(html_code)
Out[44]:

Objective:

The objective of social media sentiment analysis is to systematically identify, extract, quantify, and study affective states and subjective information from text data generated by users on social media platforms. Here are the key objectives and benefits of conducting sentiment analysis on social media data:

Understanding Public Sentiment: To gauge public opinion regarding topics like brands, products, services, policies, or events.

Brand Monitoring: To monitor brand reputation by analyzing how people feel about a brand or company, which can inform public relations and marketing strategies.

Market Research and Analysis: To understand consumer needs and preferences, which can aid in product development and targeted marketing.

Customer Service and Support: To quickly identify and respond to negative customer experiences or feedback, thereby improving customer satisfaction and loyalty.

Political Campaigning: To assess public reaction to campaigns, policies, or political figures, which can influence campaign strategies.

Trend Analysis: To detect shifts in public mood or sentiment trends over time, which can predict market movements or societal changes.

Crisis Management: To identify potential crises brewing online, allowing organizations to take proactive measures to mitigate damage.

Competitive Analysis: To compare sentiment towards competitors, which can uncover strengths and weaknesses in comparison to market rivals.

In summary, social media sentiment analysis provides valuable insights that can influence decision-making across various levels of an organization, from marketing to customer service to product development.

Data Exploration

In [1]:
import pandas as pd
import numpy as np
In [2]:
df = pd.read_csv(r'C:\Users\BHAVIN\Desktop\UOP Sem 1\Personal Project\Project 3 - Sentiment Analysis\sentimentdataset.csv')
In [3]:
# Count missing values in each column
df.isna().sum()
Out[3]:
Unnamed: 0.1    0
Unnamed: 0      0
Text            0
Sentiment       0
Timestamp       0
User            0
Platform        0
Hashtags        0
Retweets        0
Likes           0
Country         0
Year            0
Month           0
Day             0
Hour            0
dtype: int64
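
The `Unnamed: 0.1` and `Unnamed: 0` columns in the output above look like index columns left over from earlier CSV saves. That is an assumption; the source does not say where they come from. If it holds, a minimal sketch of dropping them, shown on a small hypothetical stand-in frame rather than the real dataset:

```python
import pandas as pd

# Hypothetical stand-in for the real dataset, with the same leftover columns
sample = pd.DataFrame({
    "Unnamed: 0.1": [0, 1],
    "Unnamed: 0": [0, 1],
    "Text": ["Enjoying a beautiful day", "Traffic was terrible"],
    "Sentiment": ["Positive", "Negative"],
})

# Collect every column whose name starts with "Unnamed" and drop them together
leftover = [col for col in sample.columns if col.startswith("Unnamed")]
cleaned = sample.drop(columns=leftover)

print(cleaned.columns.tolist())
```

Applying the same two lines to `df` would leave the thirteen descriptive columns untouched.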
In [4]:
!pip install wordcloud
Requirement already satisfied: wordcloud in d:\python\lib\site-packages (1.9.3)
Requirement already satisfied: pillow in d:\python\lib\site-packages (from wordcloud) (9.4.0)
Requirement already satisfied: matplotlib in d:\python\lib\site-packages (from wordcloud) (3.7.0)
Requirement already satisfied: numpy>=1.6.1 in d:\python\lib\site-packages (from wordcloud) (1.23.5)
Requirement already satisfied: contourpy>=1.0.1 in d:\python\lib\site-packages (from matplotlib->wordcloud) (1.0.5)
Requirement already satisfied: pyparsing>=2.3.1 in d:\python\lib\site-packages (from matplotlib->wordcloud) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in d:\python\lib\site-packages (from matplotlib->wordcloud) (2.8.2)
Requirement already satisfied: packaging>=20.0 in d:\python\lib\site-packages (from matplotlib->wordcloud) (22.0)
Requirement already satisfied: kiwisolver>=1.0.1 in d:\python\lib\site-packages (from matplotlib->wordcloud) (1.4.4)
Requirement already satisfied: fonttools>=4.22.0 in d:\python\lib\site-packages (from matplotlib->wordcloud) (4.25.0)
Requirement already satisfied: cycler>=0.10 in d:\python\lib\site-packages (from matplotlib->wordcloud) (0.11.0)
Requirement already satisfied: six>=1.5 in d:\python\lib\site-packages (from python-dateutil>=2.7->matplotlib->wordcloud) (1.16.0)
In [5]:
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import pandas as pd


# Optional document-frequency thresholds for TfidfVectorizer (left disabled here)
#min_df = 0.001
#max_df = 0.75

# Build TF-IDF features from the text, dropping common English stop words
tfidf = TfidfVectorizer(stop_words='english')
X = tfidf.fit_transform(df['Text'].fillna(''))

# Retrieve the terms found by the vectorizer
terms = tfidf.get_feature_names_out()

# Sum the TF-IDF scores for each term across all documents
sums = X.sum(axis=0)

# Flatten the sums array and convert to a list
sums = sums.A1.tolist()

# Create a dictionary with terms and their summed scores
freqs = dict(zip(terms, sums))
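
Because `freqs` maps each term to its summed TF-IDF weight, ranking it gives a quick view of the dominant vocabulary. A minimal sketch follows, using a handful of values rounded from this notebook's own `freqs` output; `top_terms` and `sample_freqs` are hypothetical names introduced for illustration, not part of the original code:

```python
import heapq

def top_terms(freqs, n=3):
    """Return the n (term, score) pairs with the highest summed TF-IDF score."""
    return heapq.nlargest(n, freqs.items(), key=lambda item: item[1])

# Small illustrative subset of the real dictionary (values rounded)
sample_freqs = {
    "ablaze": 0.37,
    "art": 5.21,
    "beauty": 6.30,
    "day": 7.71,
    "feeling": 7.55,
}

for term, score in top_terms(sample_freqs):
    print(f"{term}: {score:.2f}")
```

The same dictionary shape is what `WordCloud.generate_from_frequencies` accepts, which is presumably why `WordCloud` is imported in this cell.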
In [6]:
X.shape
Out[6]:
(732, 2389)
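
The shape above means 732 posts by 2,389 distinct terms. As a rough sketch of how each entry in such a matrix is weighted, the function below mirrors scikit-learn's default formula (raw term counts, smoothed IDF of `ln((1 + n) / (1 + df)) + 1`, then L2 normalization per row). It is a simplification: it splits on whitespace and removes no stop words, so it will not reproduce the notebook's numbers, and `tfidf_rows` is a hypothetical helper name.

```python
import math
from collections import Counter

def tfidf_rows(docs):
    """Compute L2-normalized TF-IDF weights per document (simplified tokenizer)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(docs)

    # Document frequency: number of documents containing each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}

    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {term: count * idf[term] for term, count in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({term: w / norm for term, w in weights.items()} if norm else {})
    return rows

docs = ["sunny beach day", "rainy day", "beach day fun"]
rows = tfidf_rows(docs)

# Rarer terms receive higher weight than terms that appear in every document
print(sorted(rows[0].items(), key=lambda item: item[1], reverse=True))
```

In the first toy document, "sunny" (one document) outweighs "beach" (two documents), which outweighs "day" (all three). This is the effect that keeps ubiquitous words from dominating the summed scores.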
In [7]:
freqs
Out[7]:
{'ablaze': 0.3743133741652681,
 'abstract': 0.7758276502269683,
 'abyss': 1.5768819099523883,
 'academic': 0.6641837768566297,
 'acceptance': 3.164689256138543,
 'accepts': 0.327556629856786,
 'accidentally': 1.3972024741683269,
 'accomplished': 0.899111881251432,
 'accomplishing': 0.5020354623940118,
 'accomplishment': 2.05145014575037,
 'achieve': 0.80730042392012,
 'achieved': 0.7926867203408545,
 'achievement': 2.1438125157547425,
 'achievements': 0.6801751439560559,
 'achieving': 2.2536624951175854,
 'aching': 0.44852075972177685,
 'acoustic': 0.36269358335367163,
 'act': 0.8986043553230838,
 'action': 1.067707527900664,
 'activated': 0.5804444409325654,
 'activities': 0.6308463204613257,
 'actor': 0.6588502535981346,
 'acts': 1.085342103666692,
 'adele': 0.697656541609925,
 'admiration': 0.764607657503759,
 'admiring': 0.9700637113111164,
 'adopted': 0.49860558594731424,
 'adorable': 0.5451156696590179,
 'adoration': 0.959708492360543,
 'adrenaline': 0.7055624754874817,
 'adrift': 0.8000497197900849,
 'adventure': 3.426344365774224,
 'adventurer': 0.4001434935061071,
 'adventures': 0.9873528611271685,
 'adversity': 0.29751782762710344,
 'aesthetic': 0.29723468436514405,
 'affair': 0.2772492378396552,
 'affect': 0.2891674535609628,
 'affection': 0.4522656352264657,
 'affectionate': 1.0372449787183053,
 'afloat': 0.3509657287230741,
 'afternoon': 0.635347646325918,
 'aftertaste': 0.3101288450018355,
 'age': 1.5118182624283243,
 'ageless': 0.3149576786902122,
 'aging': 0.34709956326803354,
 'ahead': 2.1105698447730257,
 'ai': 0.5488788943232074,
 'air': 3.772167409772652,
 'airplanes': 0.3073430838685552,
 'akin': 0.7240664478166473,
 'album': 1.2135987847142224,
 'alleys': 0.689504938136739,
 'alps': 0.33740068333895656,
 'amazed': 0.3731467220125721,
 'amazing': 0.5606691635746824,
 'amazon': 0.37811301790970403,
 'ambivalence': 2.0733122764704826,
 'amid': 0.8237034978275461,
 'amidst': 1.6510237913433465,
 'amused': 0.8917490851411334,
 'amusement': 0.4772440732164973,
 'amusing': 0.5432736849787542,
 'ancient': 3.5551272790576496,
 'anger': 0.5148908673953384,
 'angkor': 0.3537369944388286,
 'announcement': 0.49840509806774486,
 'announcements': 0.30452323687539495,
 'annoyance': 0.3278432134102421,
 'answers': 0.36693443970957534,
 'anthem': 0.3694156419108118,
 'anthemic': 0.41286043294044605,
 'anticipated': 0.6497443186895973,
 'anticipating': 1.546949070032642,
 'anticipation': 1.707421710210077,
 'antics': 0.8917490851411334,
 'anxiety': 1.727916504990298,
 'apart': 0.4506788475292253,
 'applauding': 0.3716182971281748,
 'applauds': 0.3382076713882025,
 'applause': 0.29270231348496456,
 'appreciating': 0.5906118203473788,
 'appreciation': 1.415012454905932,
 'apprehensive': 0.6175654130086119,
 'archaeological': 0.8817143606302,
 'architectural': 0.3731467220125721,
 'architecture': 1.1018988674090695,
 'argument': 0.8603694794912115,
 'ariana': 0.3727084418531491,
 'arise': 0.4312058920948606,
 'aroma': 0.9000396375869744,
 'aromas': 0.3143434282315866,
 'arousal': 2.0658299454509934,
 'art': 5.210935317789404,
 'artclass': 0.3112186980033984,
 'artclassadventures': 0.33244007844611334,
 'artgallery': 0.3019792719693925,
 'artist': 1.4981081505238358,
 'artistic': 0.9514776386281865,
 'artistry': 0.8559654620720023,
 'ashes': 0.3289326178612655,
 'assembly': 0.30452323687539495,
 'assignments': 0.42976264296881417,
 'assuredness': 0.9391477693469601,
 'astronomy': 0.6295656043832729,
 'astronomyclub': 0.3437646916222524,
 'athlete': 0.30658800725050434,
 'atlas': 0.3462939561428735,
 'atmosphere': 1.2551639162865187,
 'attempt': 0.3099173613121918,
 'attempting': 1.1636014491409092,
 'attempts': 0.6397262590591828,
 'attended': 1.747405175656192,
 'attending': 3.4558774945512094,
 'attic': 0.42413441020170045,
 'audience': 0.9335695946272065,
 'author': 0.5436080359552504,
 'autumn': 0.7164927382947353,
 'avoided': 0.32877518535606975,
 'avoiding': 0.7971167423494943,
 'await': 0.8458595227095976,
 'awaited': 1.442556330582428,
 'awaits': 0.3773610617390262,
 'awakened': 0.46836372570165224,
 'award': 0.327556629856786,
 'awareness': 0.3567942525949433,
 'awash': 0.42542765087892886,
 'away': 2.2543033747919194,
 'awe': 3.133764939836856,
 'awkward': 0.6431632867086736,
 'azure': 0.9847807777654817,
 'baby': 0.3373408058328683,
 'background': 0.424951870593631,
 'backpack': 0.29980957431763355,
 'backseat': 0.6485131984519917,
 'bad': 1.8267367984148897,
 'bag': 0.30853972755563086,
 'bake': 0.3783295097807415,
 'baked': 0.6472284612299019,
 'baking': 0.6333847846563401,
 'balancing': 0.9126374387668599,
 'ball': 0.4148954769167625,
 'ballroom': 1.6544174798621034,
 'ballroomdancing': 0.3149576786902122,
 'bamboo': 0.3537567362634633,
 'barefoot': 0.4053476388617273,
 'barrier': 0.3205756832510398,
 'basketball': 0.567763974907068,
 'basking': 1.3832366647170256,
 'bathed': 0.9464113893954904,
 'battle': 0.3872672857306431,
 'beach': 0.49469932578371956,
 'beacon': 0.41498794003460304,
 'beat': 0.3753617374099898,
 'beats': 0.8845069476296892,
 'beautiful': 1.7302387654089149,
 'beauty': 6.295252849722176,
 'beautyinaging': 0.34709956326803354,
 'begins': 0.4139828628672095,
 'believe': 0.5298902585041125,
 'bell': 0.31956510294758417,
 'belongs': 0.3404262344211208,
 'best': 1.3124067493859832,
 'betrayal': 1.9535280584745913,
 'betrayals': 0.5807289300224611,
 'better': 0.31909490979185945,
 'beyoncé': 0.3998272179125162,
 'bidding': 0.841658897586259,
 'bieber': 0.3373408058328683,
 'bike': 0.49061364830726495,
 'binge': 0.6167223212476525,
 'birthday': 2.4138534038865105,
 'bit': 0.9634976668345407,
 'biting': 0.3237724524126084,
 'bitter': 1.999300129581258,
 'bitterness': 1.535507504576376,
 'bittersweet': 1.9108148642197487,
 'blank': 0.32954259423419285,
 'blankets': 0.4315019470466966,
 'blending': 0.30855114407850215,
 'blessed': 0.47589363687555297,
 'blessing': 0.4154914443757559,
 'blessings': 0.9337023006208669,
 'block': 0.49309399935417453,
 'blog': 1.2415387205570223,
 'bloggerlife': 0.3490223173605548,
 'blogging': 0.3490223173605548,
 'bloom': 1.3551014561828838,
 'blooming': 0.9428742391102425,
 'blooms': 1.658156189221192,
 'blossoming': 0.39260137994903554,
 'blossoms': 0.3226724725378395,
 'blowout': 0.29969499842601083,
 'blues': 0.5162548923908687,
 'blurring': 0.32017406816094507,
 'boat': 0.6796221685958691,
 'bob': 0.3414753187796613,
 'boils': 0.4967269469501389,
 'bold': 0.6706362948221486,
 'bonding': 0.40401162424572873,
 'bonds': 0.30253666096892495,
 'book': 4.047400124808528,
 'bookclub': 0.35009800529938956,
 'booked': 0.3773610617390262,
 'booklover': 0.3244520111336667,
 'bookstore': 0.45115948387810856,
 'bookwormlife': 0.3423478993914908,
 'boredom': 1.2515446799753847,
 'boring': 0.40665126509209093,
 'born': 0.3042787394797246,
 'bought': 0.2969436374375238,
 'boulder': 0.3462939561428735,
 'bounce': 0.3191063390889068,
 'boundaries': 0.32646996054961813,
 'boundless': 1.0086213959724488,
 'bout': 0.32275370855230484,
 'boxer': 0.32275370855230484,
 'bracelets': 0.3379910939010853,
 'branches': 0.3738405608820981,
 'bread': 0.6472284612299019,
 'break': 0.3834882442010499,
 'breaking': 0.33052951420508303,
 'breakthrough': 0.32646996054961813,
 'breathtaking': 1.9027555837134533,
 'breeding': 0.34060871830099965,
 'breeze': 1.497768983192638,
 'brewed': 0.3128752074930167,
 'brewing': 0.51589643323856,
 'bright': 0.4779340674942243,
 'brightened': 0.5432736849787542,
 'brighter': 1.9633674858530294,
 'brilliance': 0.4017905277290101,
 'bring': 0.7804452714228927,
 'bringing': 0.38049545223727266,
 'brings': 0.8007370972583754,
 'broken': 1.9626352886887601,
 'brought': 0.961092891535172,
 'brunch': 0.6597227595443084,
 'bruno': 0.35482072637026213,
 'brushstrokes': 0.6873852932171127,
 'building': 2.0872228824583026,
 'builds': 0.8930690377770403,
 'built': 0.3416947911606543,
 'burger': 0.5999687512981846,
 'burning': 0.7265773654394676,
 'burst': 1.2395519739608778,
 'business': 0.5100599131609056,
 'bustling': 0.8524600303030342,
 'busy': 0.5496229565649215,
 'bygone': 0.6350078621232875,
 'cacophony': 0.7304902404143544,
 'cafeteria': 0.39915992476024736,
 'café': 0.31594178495245473,
 'calm': 0.4152710804081077,
 'calmness': 1.864699724863564,
 'camaraderie': 0.4765055994083423,
 'camera': 0.3724014818249699,
 'canal': 0.37406470640804307,
 'candy': 0.9428549421626302,
 'canvas': 1.9582222559782692,
 'canvases': 0.32954259423419285,
 'canyon': 0.3825092367185883,
 'capped': 0.33740068333895656,
 'capsule': 0.3671873736813497,
 'captivated': 1.5050685315049555,
 'captivating': 1.2388161974378384,
 'capture': 0.3671873736813497,
 'capturing': 1.8159273939927858,
 'car': 0.31739722838260637,
 'care': 0.9725770851209639,
 'career': 0.772978632753482,
 'carefree': 0.40581813303410114,
 'carnival': 1.6384053173147284,
 'carousel': 0.9428549421626302,
 'carried': 0.38510960452003473,
 'carrying': 1.0192828229626056,
 'cartoonnostalgia': 0.3908873119187563,
 'cartoons': 0.3908873119187563,
 'cartwheel': 0.3753617374099898,
 'cartwheels': 0.3753617374099898,
 'cast': 0.2859674149371948,
 'casting': 1.5166948392912454,
 'cathedral': 0.8423508661283768,
 'caught': 2.02525001443157,
 'causes': 0.6800724119668864,
 'causing': 0.28573584956036324,
 'cease': 0.3254827837095463,
 'celebrate': 0.33830408650662197,
 'celebrating': 2.4790618642205686,
 'celebration': 2.0427348643597254,
 'celebrations': 0.34854681495707573,
 'celestial': 0.32508160420491017,
 'cemetery': 0.3969350164876397,
 'chains': 0.3623551364336947,
 'challenge': 1.7392227528517714,
 'challenges': 5.066495397765265,
 'challenging': 4.260824981733045,
 'chamber': 0.7481242440982522,
 'chambers': 0.39986207605715257,
 'championship': 1.3457449702120496,
 'chance': 0.2851473344597939,
 'chandeliers': 0.44755001072360534,
 'change': 0.6646401698790956,
 'changing': 0.8227932037544496,
 'chaos': 1.7638912958592041,
 'characters': 0.9116469300318986,
 'charity': 2.2282947765787737,
 'charityrun': 0.3205756832510398,
 'charm': 0.9032396827144603,
 'charting': 0.6846275825612989,
 'chasing': 1.116389613246776,
 'chat': 0.3213098046984416,
 'check': 0.4607806880327846,
 'cheering': 0.29270231348496456,
 'cheers': 0.526990553529943,
 'chef': 0.6014661795736115,
 'chefmode': 0.34441939679344385,
 'chemistry': 0.3665977053493746,
 'cherished': 2.5044330475182845,
 'chest': 0.7162164093614751,
 'child': 0.3514197354141253,
 'childhood': 1.559285079994573,
 'children': 0.4649921855379401,
 'chill': 0.32345911793236115,
 'chilly': 0.8693607191308917,
 'china': 0.3666567204706499,
 'chips': 0.30853972755563086,
 'choices': 0.8476352719988425,
 'choir': 0.30253666096892495,
 'choosing': 0.29723468436514405,
 'chord': 0.3414753187796613,
 'chords': 0.7110242038179502,
 'chorus': 0.4228077004503558,
 'cinematic': 0.6789661076909611,
 'circulating': 0.3404723052478099,
 'circumstances': 0.7636314613270277,
 'circus': 0.6384443066666247,
 'city': 3.3765887241238284,
 'civilization': 1.0389700646464373,
 'civilizations': 0.3654786122349122,
 'claim': 0.29270231348496456,
 'claimed': 0.3211426558013953,
 'clarity': 0.4260104582275417,
 'class': 1.8685669054348137,
 'classcountdown': 0.31956510294758417,
 'classes': 0.6751697208917486,
 'classic': 0.6270887612955311,
 'classical': 0.298891067644867,
 'classicalmusic': 0.298891067644867,
 'classicrides': 0.31739722838260637,
 'classics': 0.31739722838260637,
 'classmates': 1.2500195255767648,
 'classroom': 0.9807341276396778,
 'cleaner': 0.3567942525949433,
 'cleanup': 0.3567942525949433,
 'click': 0.3185020720715461,
 'clicks': 0.4109610918717637,
 'cliff': 0.43715250016931867,
 'climber': 0.34073655001336917,
 'clinking': 0.2993859278415445,
 'cloak': 0.7886324032213217,
 'clock': 0.31956510294758417,
 'close': 0.3762614175044324,
 'closing': 0.7192756247511053,
 'cloud': 0.3354968873742172,
 'clouding': 0.4422443906669401,
 'clouds': 2.4052836780377214,
 'club': 2.4602395248243303,
 'clues': 0.2990578615716855,
 'coastal': 0.43715250016931867,
 'cocoa': 1.2190552340569156,
 'coding': 1.6382387125709106,
 'codingjourney': 0.4368523087555226,
 'coffee': 1.4384044391492374,
 'cold': 0.6534765169892873,
 'colder': 0.5079175854636201,
 'coldplay': 0.41498794003460304,
 'collaborating': 0.9505460172342716,
 'collaboration': 0.38077458568426986,
 'collection': 0.30855114407850215,
 'college': 0.374373821963917,
 'color': 0.3353181474110743,
 'colors': 2.850130002002085,
 'colosseum': 0.370101017068995,
 'comeback': 0.29751782762710344,
 'comedy': 1.5333180342231483,
 'comet': 0.4017905277290101,
 'coming': 0.493277794436099,
 'comments': 0.9434772142983775,
 'communication': 0.3232285482726077,
 'community': 3.4809802900851317,
 'communitychoir': 0.30253666096892495,
 'communitygarden': 0.3155470844228972,
 'companion': 2.61940912186479,
 'companionship': 0.610088082782155,
 'company': 0.37339268531532505,
 'compassion': 2.7264720781248992,
 'compassionate': 1.206047880080092,
 'compete': 0.32044895525778627,
 'competition': 0.8982418768938755,
 'completing': 2.749646745138894,
 'complex': 0.36693443970957534,
 'complexities': 0.42914973110448473,
 'composing': 0.7266253438714736,
 'concealing': 0.3481010768344638,
 'concert': 4.250912958869969,
 'concertvibes': 0.3828136059131857,
 'conference': 0.5488788943232074,
 'confetti': 0.36349016178240584,
 'confidence': 1.3329255602929504,
 'confident': 0.9768297982614749,
 'conflicting': 1.508528502748339,
 'conformity': 0.3623551364336947,
 'confusion': 2.718381541673232,
 'connected': 0.30452323687539495,
 'connecting': 1.1350703720916338,
 'connection': 2.534966477737142,
 'connections': 0.9373979133510937,
 'connoisseur': 0.5782489303634141,
 'conquered': 0.3211426558013953,
 'conquering': 1.6170829292896947,
 'conquers': 0.34073655001336917,
 'conscious': 0.2891674535609628,
 'consecutive': 0.603662318273856,
 'console': 0.3189460900248544,
 'constant': 1.7954255386398752,
 'constellation': 0.34655123518181297,
 'constellations': 0.7322423725692636,
 'consumes': 0.5618547925319187,
 'contact': 0.32877518535606975,
 'contemplating': 0.7908753949201264,
 'contemplation': 1.0588841663106987,
 'contentment': 3.7539621781229378,
 'contributing': 0.401928707826494,
 'conversation': 1.1441423124433858,
 'convinced': 0.3952678502446028,
 'cook': 0.30853972755563086,
 'cooked': 1.088994060549772,
 'cooking': 1.022169731413498,
 'corners': 0.3042787394797246,
 'corruption': 0.95192743694273,
 'cosmic': 0.47054438075144916,
 'cosmos': 1.4491529457143628,
 'costs': 0.2902534443080849,
 'costume': 0.37079113268062636,
 'costumes': 0.4148954769167625,
 'cotton': 0.9428549421626302,
 'countdown': 0.4139828628672095,
 'country': 0.367549143621267,
 'course': 0.6846275825612989,
 'covered': 0.28613832082444035,
 'coveting': 0.5053987280530116,
 'cozy': 1.7994698545978538,
 'crafted': 0.6666835840172292,
 'crafting': 1.0038843910791244,
 'crash': 0.30363475689059777,
 'crashing': 0.4198212725535301,
 'cravings': 0.29980957431763355,
 'create': 1.5177585224079029,
 'creates': 0.8992384199303038,
 'creating': 3.0920498450479177,
 'creation': 0.31421554828615156,
 'creative': 0.887169971383424,
 'creativity': 3.643616357691352,
 'creatures': 0.4342039891733066,
 'credits': 0.605918906086786,
 'creeps': 0.5079175854636201,
 'crevice': 0.4273023554438699,
 'cricket': 0.3237724524126084,
 'crime': 0.3050445002932264,
 'crisis': 0.3592649102236794,
 'critical': 0.3232285482726077,
 'crossroads': 0.8537775239954382,
 'crowd': 0.9040582869360801,
 'crowded': 0.515232211589256,
 'crucial': 0.5421620036994119,
 'cruelty': 0.5242743934773993,
 'cruise': 0.6796221685958691,
 'cruising': 0.37342356886019007,
 'crumbles': 0.4586512203428681,
 'crush': 0.8731650351530714,
 'crushing': 0.8085749064876369,
 'crystal': 0.44755001072360534,
 'culinary': 1.1434715994918752,
 'cultural': 0.7689223131366024,
 'culture': 0.4895137486502935,
 'cultured': 0.35053708745897405,
 'cup': 2.946914667604602,
 'curiosity': 3.3429056248443407,
 'currency': 0.42675037411645655,
 'current': 0.38510960452003473,
 'currents': 0.39410851592682283,
 'curtain': 0.6485131984519917,
 'customer': 0.47862280012533687,
 'cute': 0.9159299005606687,
 'cyberbullying': 0.43742516722970654,
 'cycling': 0.874428365208808,
 'cyclingclub': 0.33158529903880074,
 'cyclist': 0.29969499842601083,
 'dabbling': 0.3656836831644756,
 'daily': 0.8628075454959219,
 'dance': 4.794990925087379,
 'danceallnight': 0.3828136059131857,
 'danceclass': 0.30107979166337545,
 'danced': 0.6787671527705726,
 'dancer': 0.31770747638869457,
 'dancing': 2.8377758110040783,
 'dangle': 0.32345911793236115,
 'dark': 1.2217363958746583,
 'darker': 0.6152466746676181,
 'darkest': 0.3042787394797246,
 'darkness': 0.8547964870908928,
 'dawn': 0.9635438580066885,
 'day': 7.709537627797752,
 'daydreaming': 0.4183669129595076,
 'days': 1.093348218502129,
 'dazzled': 0.4148954769167625,
 'dazzles': 0.32993523379531836,
 'dazzling': 0.4148954769167625,
 'dealing': 1.0223312195774539,
 'dear': 0.841658897586259,
 'dearly': 0.5218060076272676,
 'debate': 0.6687691057522764,
 'debris': 0.32647078582563344,
 'debugging': 0.4368523087555226,
 'decided': 0.6103957789572774,
 'decisions': 2.2811311104275505,
 'decor': 0.31909490979185945,
 'dedication': 0.7799920030970569,
 'deep': 0.36159005264448374,
 'deepens': 0.8802868803773027,
 'deeper': 0.667681198385463,
 'defeat': 0.9117262460617488,
 'defeats': 0.33658820101435405,
 'defies': 0.29270231348496456,
 'delicious': 0.6374284770751699,
 'delight': 1.5373493973541157,
 'delights': 0.28613832082444035,
 'demeanor': 0.45110746677164937,
 'department': 0.47862280012533687,
 'depths': 0.29662040142739954,
 'derived': 0.47180579495503483,
 'descend': 0.4073787502124275,
 'descends': 0.7192460414210639,
 'desert': 0.3509356882990055,
 'designer': 0.30855114407850215,
 'desire': 0.5873300497508636,
 'despair': 4.493986892928637,
 'desperation': 0.4309885300208834,
 'despite': 0.8001532452796221,
 'dessert': 0.6081969385882942,
 'destination': 0.5192799157728847,
 'detached': 0.4380209494156834,
 'details': 0.4817845574689075,
 'determination': 2.4214032995065575,
 'determined': 0.9064515240477109,
 'devastated': 0.9584230356847581,
 'development': 0.34563231070946804,
 'devour': 0.34013902827950226,
 'diary': 0.7119483850411215,
 'did': 0.33160017547958875,
 'didn': 0.3558192806841566,
 'difference': 0.5487977055299458,
 'different': 0.3709682616235439,
 'difficult': 0.4614976422041141,
 'digital': 0.7273889050433409,
 'digitalartistry': 0.3670855011666013,
 'dinner': 1.2153939053537561,
 'disappointed': 1.0085732286675064,
 'disappointment': 1.8676765644720565,
 'disaster': 0.4472135954999579,
 'discontent': 0.34060871830099965,
 'discover': 0.3670855011666013,
 'discovered': 0.3423478993914908,
 'discovering': 1.8394433262427878,
 'discovers': 0.3353748684371222,
 'discovery': 1.0739676544534915,
 'discussions': 0.8070265276628252,
 'disgust': 0.9908735004323876,
 'disgusting': 0.9507144998190568,
 'disheartened': 0.374373821963917,
 'disheartening': 0.43742516722970654,
 'dismissive': 0.9271936581133846,
 'disneyland': 0.40152448655623363,
 'display': 0.6375328929014044,
 'displayed': 0.538580860765043,
 'distant': 1.0136911364953538,
 'diverse': 0.6456758614112259,
 'diversity': 0.9772568756837439,
 'diving': 0.9295914904714375,
 'diy': 1.192187165931752,
 'diyadventure': 0.31909490979185945,
 'documentaries': 0.34650222310820444,
 'documenting': 0.332649626356379,
 'don': 0.2910586937352798,
 'doodle': 0.40665126509209093,
 'doodles': 0.40665126509209093,
 'door': 0.4140381694807299,
 'double': 0.49357301667208553,
 'doubt': 0.31988618935060087,
 'downs': 0.5397315808343396,
 'drama': 0.9606343980276186,
 'draped': 0.4342380144238378,
 'drawn': 0.6485131984519917,
 'dream': 1.0852831075302745,
 'dreams': 6.12645651577954,
 'drenched': 0.7747432912195215,
 'dress': 0.4183669129595076,
 'dressed': 0.3803543870010392,
 'drifting': 0.8486239609808142,
 'drink': 0.29096848119888474,
 'drivers': 0.32044895525778627,
 'driving': 0.7217364501758325,
 'drowning': 1.921168010049471,
 'dull': 0.7574101009904726,
 'dust': 0.43946840669893117,
 'duty': 0.2999843756490923,
 'eagerly': 0.28613832082444035,
 'earning': 0.30658800725050434,
 'earth': 0.3825092367185883,
 'easy': 0.3242165636339132,
 'eat': 0.2969436374375238,
 'eats': 0.5326449252526559,
 'ebb': 0.43645848596811176,
 'echo': 1.7534278554602718,
 'echoed': 0.2993859278415445,
 'echoes': 4.907354840630934,
 'echoing': 0.9999027045453277,
 'ecstasy': 0.34252693520041877,
 'ed': 0.36269358335367163,
 'edge': 0.8539615729939463,
 'efficiency': 0.36460646069064007,
 'effort': 0.5998027060587965,
 'eiffel': 0.38899509839154855,
 'elaborate': 0.3168507253369605,
 'elation': 3.121434077344014,
 'elegance': 1.0591943221943323,
 'elusive': 0.3141506462672826,
 'embarked': 1.1206936181759592,
 'embarking': 2.611117137531606,
 'embarrassment': 0.3245670446605808,
 'embrace': 2.522875774551816,
 'embraced': 0.9635438580066885,
 'embraces': 0.30363475689059777,
 'embracing': 4.1196988961091465,
 'emergency': 0.6203676820557117,
 'emotion': 1.1040972999918115,
 'emotional': 2.365209818225845,
 'emotions': 4.825361494905211,
 'empathetic': 0.35446415838656753,
 'empathy': 1.9745874905517586,
 'empire': 0.3694156419108118,
 'empowered': 0.9838040857124203,
 'empowerment': 1.6804962277613833,
 'emptiness': 0.3762614175044324,
 'enchanting': 0.736509371280053,
 'enchantment': 0.40152448655623363,
 'encountered': 0.3242165636339132,
 'encountering': 0.3571930736120489,
 'encourage': 0.3189460900248544,
 'end': 0.7155396882239565,
 'ended': 0.5514313811574505,
 'ending': 0.3210986479440757,
 'endless': 0.8100966201005839,
 'endlessly': 0.8898330328826118,
 'endurance': 0.33052951420508303,
 'energy': 0.9502820755119048,
 'engagement': 0.3232285482726077,
 'engineering': 0.3666567204706499,
 'engines': 0.39438310074663824,
 'engrossed': 0.34755910658016137,
 'engulfed': 0.3128752074930167,
 'engulfing': 0.4779838727062616,
 'engulfs': 0.891661031115522,
 'enhance': 0.6474391288934083,
 'enjoying': 2.7122363406390475,
 'enjoyment': 0.5223479675228202,
 'enlightened': 0.34650222310820444,
 'enrolled': 0.30107979166337545,
 'entangled': 0.40113431372170966,
 'entered': 0.30853972755563086,
 'enthusiasm': 3.1683416305067196,
 'enthusiast': 1.3979183080339104,
 'enthusiastically': 0.6100709028546328,
 'enthusiasts': 0.38077458568426986,
 'entwining': 0.3398055242478306,
 'enveloped': 0.4987364032173856,
 'enveloping': 0.32017406816094507,
 'envelops': 0.39303693936772865,
 'envious': 0.8185865492269637,
 'environment': 0.5271343831245071,
 'environmental': 0.3567942525949433,
 'envisioning': 0.3446624017353284,
 'envy': 0.9770803606849594,
 'equations': 0.7532573314883182,
 'era': 0.7313907942141284,
 'eras': 0.3446624017353284,
 'erased': 0.42413441020170045,
 'eruption': 0.4967269469501389,
 'erupts': 0.5903035615266332,
 'escalates': 0.5723274724649586,
 'escapade': 0.9428549421626302,
 'escape': 0.6647483209515617,
 'essence': 0.8330228547951032,
 'eternal': 0.3019792719693925,
 'ethereal': 0.4663680732632333,
 'euphoria': 2.515862670174713,
 'euphoric': 0.3727084418531491,
 'evening': 3.805904579058833,
 'event': 2.3231180206127195,
 'events': 1.1820356703443204,
 'everyday': 0.391850806001502,
 'evoking': 0.6537933124958586,
 'exam': 0.40498771083255314,
 'exams': 0.9033272162508643,
 'exceptional': 0.3043413668575558,
 'excited': 1.0974940746802657,
 'excitement': 5.1833738945193595,
 'exciting': 0.31956510294758417,
 'exhaustion': 0.726456949387188,
 'exhibition': 1.379484770057406,
 'exhilarating': 0.470809204055908,
 'existence': 0.43645848596811176,
 'exotic': 0.9278863649261648,
 'expanding': 0.5108974553243968,
 'expanse': 0.32508160420491017,
 'expectations': 0.916647700410269,
 'expected': 0.4103496212745851,
 'experience': 1.166556822960476,
 'experienced': 0.4895137486502935,
 'experiences': 1.5192541154087158,
 'experiencing': 2.6461850619533327,
 'experiment': 0.9487321943483422,
 'experimenting': 0.671735825435106,
 'expert': 1.2158106256122976,
 'exploration': 0.6745158239966558,
 'explore': 1.6984511995767655,
 'explorer': 0.37693494894749824,
 'exploring': 5.023043862986883,
 'expresses': 0.31770747638869457,
 'expressing': 0.36736209065974584,
 'expression': 0.3259619897120836,
 'extending': 0.35446415838656753,
 'extracurricular': 0.3656836831644756,
 'extraordinary': 0.37663683861075475,
 'exuberance': 0.40581813303410114,
 'eye': 0.32877518535606975,
 'eyed': 0.3354968873742172,
 'eyes': 1.377333506495089,
 'facade': 0.3481010768344638,
 'face': 0.39909437317547497,
 'faces': 0.9062025351912397,
 'facing': 1.0007160897825336,
 'fade': 0.4016461815491547,
 'fail': 1.1402304330210624,
 'failed': 0.3099173613121918,
 'failure': 0.3558192806841566,
 'fair': 0.6546596671667766,
 'fairy': 1.0718141008107769,
 'fairytale': 0.4183669129595076,
 'faith': 0.8535481129643043,
 'fall': 1.574231589905133,
 'falling': 0.9373979133510937,
 'family': 3.9042866973439647,
 'familydinner': 0.2993859278415445,
 'familyrecipes': 0.3247158021410614,
 'fangirling': 0.40401162424572873,
 'fans': 1.2963167671635205,
 'fantasies': 0.40152448655623363,
 'fantasy': 0.3775208917183041,
 'farewell': 0.841658897586259,
 'farewells': 0.3762614175044324,
 'fascinated': 0.3365171256176993,
 'fashion': 0.6219713782616831,
 'fashionista': 0.35222709660756846,
 'favorite': 1.5901723842702733,
 'favorites': 0.4431234494438136,
 'fear': 1.3847375184506687,
 'fearful': 0.8799669841344311,
 'fearless': 0.37693494894749824,
 'featuring': 0.3254649519040373,
 'feeling': 7.546743506424512,
 'feelings': 0.4117895265323813,
 'feels': 0.8494749044713439,
 'feet': 0.337241311519761,
 'fellow': 0.6431810427373167,
 'fence': 0.34597669424634986,
 'ferrari': 0.39438310074663824,
 'fervor': 0.4227660386622774,
 'festering': 0.6455068968090063,
 'festers': 0.9760419784379777,
 'festival': 1.7607778549404443,
 'field': 0.7817946500322855,
 'fields': 0.6872501109502818,
 'fierce': 0.30569239284002225,
 'fiery': 0.383107951065156,
 'fight': 0.4073787502124275,
 'figure': 0.3536118638416316,
 'filled': 2.2977751318176316,
 'fills': 0.39303693936772865,
 'film': 0.31421554828615156,
 'filmmaker': 0.31421554828615156,
 'films': 0.341762903129689,
 'filter': 0.29723468436514405,
 'final': 0.7699802888053375,
 'finals': 0.5555570871464028,
 'finding': 3.9801049579157493,
 'finds': 0.5717745497842233,
 'fine': 0.2633595947226895,
 'finest': 0.40401162424572873,
 'fingers': 1.40146114633552,
 'finish': 0.3237724524126084,
 'finished': 1.034572402860305,
 'fireflies': 0.8818884982184191,
 'fireplace': 0.47040402002608617,
 'fireworks': 0.4180774718547827,
 'fitness': 2.2409351166719427,
 'fits': 0.4070608274353992,
 'fix': 0.41498794003460304,
 'fixated': 0.30982487823531607,
 'fjords': 0.37342356886019007,
 'flag': 1.0043278411973757,
 'flat': 0.34674170250247105,
 'flavors': 1.585992064983394,
 'flaws': 0.4373408576381327,
 'flipping': 1.5258422962673794,
 'floating': 1.4099830456048783,
 'flood': 0.4431234494438136,
 'flooding': 0.3653335511582091,
 'floods': 0.7699802888053375,
 'floor': 1.3972438074903288,
 'floralbeauty': 0.3254827837095463,
 'flow': 0.9877031412392351,
 'flower': 0.3226724725378395,
 'flowers': 1.367406507953813,
 'flowing': 0.40807726203181066,
 'fly': 0.34288387222799527,
 'focus': 0.3043413668575558,
 'fog': 0.32017406816094507,
 'followers': 1.040834290321596,
 'food': 0.6560366235007976,
 'football': 0.3344322500804022,
 'footprints': 0.3211426558013953,
 'footsteps': 0.47169571916004954,
 'force': 0.37663683861075475,
 'forecast': 0.31988618935060087,
 'forest': 1.5497566210124705,
 'forever': 0.5873300497508636,
 'forging': 0.3743133741652681,
 'forgot': 0.2969436374375238,
 'forgotten': 1.515429125635107,
 'formula': 0.32044895525778627,
 'fortress': 0.690026604616649,
 'forward': 0.37663683861075475,
 'fragile': 1.107625532650779,
 'fragments': 1.322374383887736,
 'frame': 0.3173115760148589,
 'framed': 0.39894296430742593,
 'frank': 0.34288387222799527,
 'freddie': 0.3624192558555527,
 'free': 1.6144662116014208,
 'freedom': 0.5979142008000446,
 'freely': 0.6226271210752686,
 'freezes': 0.3173115760148589,
 'freezing': 0.32345911793236115,
 'fresh': 0.3952678502446028,
 'freshly': 0.9000396375869744,
 'friend': 3.829341207365952,
 'friendly': 0.34854681495707573,
 'friends': 4.7530468438124736,
 'friendship': 1.817818104229037,
 'friendships': 0.7455362012327929,
 'frontiers': 0.32646996054961813,
 'frosty': 0.28613832082444035,
 'frustrated': 0.9063347977316746,
 'frustration': 2.7496652899164395,
 'fueled': 1.8950839047804904,
 'fuels': 0.4152710804081077,
 'fuji': 0.3796288772550384,
 'fulfilling': 0.3210986479440757,
 'fulfillment': 2.036056368054229,
 'fuming': 0.5148908673953384,
 'fun': 0.3571930736120489,
 'fundraising': 0.41445036149237724,
 'funk': 0.35482072637026213,
 'furry': 0.49860558594731424,
 'future': 1.2667781522239003,
 'gaga': 0.37079113268062636,
 'gained': 0.2851473344597939,
 'gaining': 0.34563231070946804,
 'galleries': 0.5684058584548449,
 'gallery': 0.659757756094864,
 'galore': 0.33410731974665386,
 'game': 0.2969436374375238,
 'gamerlife': 0.2969436374375238,
 'gaming': 1.5504475476996682,
 'garden': 4.670928509987437,
 'gardener': 1.2087733948211323,
 'gardenwalk': 0.3254827837095463,
 'gathering': 0.5133236863911257,
 'gazes': 0.2859674149371948,
 'gazing': 0.7447066070694122,
 'geek': 0.3168507253369605,
 'gem': 1.2511675404459397,
 'gems': 0.9029314729671042,
 'generation': 0.30658800725050434,
 'generations': 0.298891067644867,
 'gently': 0.9373979133510937,
 'gestures': 0.6485131984519917,
 'getaway': 0.9053565724313186,
 'ghost': 0.8043066013750523,
 'giddy': 0.5405942746143327,
 'gift': 1.0117302635163643,
 'gig': 0.3384290672840039,
 'giggles': 0.4649921855379401,
 'gilded': 0.30982487823531607,
 'giving': 0.41445036149237724,
 'gladiator': 0.370101017068995,
 'glance': 0.34597669424634986,
 'glances': 0.3416947911606543,
 'glass': 1.6308897575255545,
 'gliding': 0.3149576786902122,
 'glimmer': 0.4309885300208834,
 'globe': 0.367549143621267,
 'glow': 2.0299417210872757,
 'gnaws': 0.48613178032374893,
 'goal': 0.8709319006771452,
 'goals': 1.3203368845766692,
 'going': 0.5440359469135485,
 'gold': 0.30658800725050434,
 'golden': 1.65015462235594,
 'golf': 0.5554225562803211,
 'golfer': 0.5554225562803211,
 'gondola': 0.37406470640804307,
 'gone': 0.6229756788348326,
 'good': 2.6088853442479936,
 'got': 1.4057516254320104,
 'gourmet': 0.34441939679344385,
 'grace': 0.6803661438070719,
 'graced': 0.3578589322497488,
 'graceful': 0.31770747638869457,
 'gracefully': 0.3149576786902122,
 'graciously': 0.327556629856786,
 'grade': 0.3949798714789474,
 'grains': 0.3277188234218516,
 'grand': 0.6477842182183233,
 'grande': 0.3727084418531491,
 'grandeur': 2.0715513542720267,
 'grapples': 0.31048634370462463,
 'grateful': 0.9593970833947998,
 'gratefulness': 0.680633744225233,
 'gratitude': 4.599294065155245,
 'great': 0.9716949748651718,
 'green': 0.6128087935026152,
 'grief': 2.991472310659767,
 'grip': 0.44675288638571753,
 'grips': 0.8825002533665771,
 'ground': 0.34060871830099965,
 'grounds': 0.34859377615871173,
 'group': 1.4319857135639085,
 'groupprojectsuccess': 0.3797102270252176,
 'groves': 0.3537567362634633,
 'grow': 0.2851473344597939,
 'growing': 0.3155470844228972,
 'grows': 0.9093972917367815,
 'growth': 2.001223003670254,
 'guiding': 0.8995397922810339,
 'guitar': 0.3624192558555527,
 'guns': 0.3514197354141253,
 'gymnast': 0.33013507878417037,
 'hacky': 0.5319301603112785,
 'hair': 1.1202189968559566,
 'hallway': 0.32877518535606975,
 'hand': 0.35446415838656753,
 'handcrafted': 0.4817845574689075,
 'hands': 0.9767888971995164,
 'handshake': 0.4432164517332948,
 'handstand': 0.3834882442010499,
 'happen': 0.3797102270252176,
 'happening': 0.5298902585041125,
 'happenings': 0.4380209494156834,
 'happiness': 2.6991990224162206,
 'hard': 0.3558192806841566,
 'harder': 0.7697039807627255,
 'harmonizing': 0.30253666096892495,
 'harmony': 1.195996454794414,
 'harmonyinaging': 0.30253666096892495,
 'hate': 0.6509629135801759,
 'hateful': 0.74795108557914,
 'haunted': 1.0751726153204855,
 'haunting': 0.3751427506886121,
 'haunts': 0.49657141365774987,
 'having': 0.5866704529909749,
 'head': 0.4152710804081077,
 'headbanging': 0.34252693520041877,
 'headphonemystery': 0.45380123838212355,
 'headphones': 0.45380123838212355,
 'heal': 0.750445961506484,
 'health': 0.5279188064263891,
 'healthy': 0.3195327462981579,
 ...}

Plotting the WordCloud for common words in text column¶

In [8]:
# Generate and display a word cloud
wordcloud = WordCloud(background_color='white')
wordcloud.generate_from_frequencies(freqs)

plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()
In [9]:
import matplotlib.pyplot as plt
df['Platform'] = df['Platform'].str.strip()
colors = ['lightblue', 'lightgreen', 'lightcoral']
sizes = df['Platform'].value_counts()

labels = sizes.index

index = sizes.values

plt.figure(figsize=(8,6))
plt.pie(index, labels=labels, colors=colors, autopct='%1.1f%%', startangle=140)
plt.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.
plt.show()

The pie chart above shows the share of posts from each social media platform, not a sentiment breakdown. Instagram has the largest share at 35.2%, followed closely by Twitter at 33.2% and Facebook at 31.6%.¶
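The percentages on the pie chart can also be read off as numbers. The following is a minimal sketch, assuming a small made-up frame (`toy`) in place of the notebook's `df`; only the `Platform` column name comes from the notebook.

```python
import pandas as pd

# Hypothetical miniature stand-in for the notebook's `df`. Only the
# 'Platform' column matters here, including the stray whitespace that
# the notebook removes with str.strip().
toy = pd.DataFrame(
    {"Platform": ["Twitter", " Twitter ", "Instagram", "Instagram ", "Facebook"]}
)

# Strip whitespace first so ' Twitter ' and 'Twitter' count as one platform.
platform = toy["Platform"].str.strip()

# normalize=True returns fractions; multiply by 100 to match the pie labels.
share_pct = (platform.value_counts(normalize=True) * 100).round(1)
print(share_pct)
```

Without the `str.strip()` step the padded names would be counted as separate platforms, which is presumably why the notebook strips the column before plotting.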

In [10]:
df.head(40)
Out[10]:
Unnamed: 0.1 Unnamed: 0 Text Sentiment Timestamp User Platform Hashtags Retweets Likes Country Year Month Day Hour
0 0 0 Enjoying a beautiful day at the park! ... Positive 2023-01-15 12:30:00 User123 Twitter #Nature #Park 15.0 30.0 USA 2023 1 15 12
1 1 1 Traffic was terrible this morning. ... Negative 2023-01-15 08:45:00 CommuterX Twitter #Traffic #Morning 5.0 10.0 Canada 2023 1 15 8
2 2 2 Just finished an amazing workout! 💪 ... Positive 2023-01-15 15:45:00 FitnessFan Instagram #Fitness #Workout 20.0 40.0 USA 2023 1 15 15
3 3 3 Excited about the upcoming weekend getaway! ... Positive 2023-01-15 18:20:00 AdventureX Facebook #Travel #Adventure 8.0 15.0 UK 2023 1 15 18
4 4 4 Trying out a new recipe for dinner tonight. ... Neutral 2023-01-15 19:55:00 ChefCook Instagram #Cooking #Food 12.0 25.0 Australia 2023 1 15 19
5 5 5 Feeling grateful for the little things in lif... Positive 2023-01-16 09:10:00 GratitudeNow Twitter #Gratitude #PositiveVibes 25.0 50.0 India 2023 1 16 9
6 6 6 Rainy days call for cozy blankets and hot coc... Positive 2023-01-16 14:45:00 RainyDays Facebook #RainyDays #Cozy 10.0 20.0 Canada 2023 1 16 14
7 7 7 The new movie release is a must-watch! ... Positive 2023-01-16 19:30:00 MovieBuff Instagram #MovieNight #MustWatch 15.0 30.0 USA 2023 1 16 19
8 8 8 Political discussions heating up on the timel... Negative 2023-01-17 08:00:00 DebateTalk Twitter #Politics #Debate 30.0 60.0 USA 2023 1 17 8
9 9 9 Missing summer vibes and beach days. ... Neutral 2023-01-17 12:20:00 BeachLover Facebook #Summer #BeachDays 18.0 35.0 Australia 2023 1 17 12
10 10 10 Just published a new blog post. Check it out!... Positive 2023-01-17 15:15:00 BloggerX Instagram #Blogging #NewPost 22.0 45.0 USA 2023 1 17 15
11 11 11 Feeling a bit under the weather today. ... Negative 2023-01-18 10:30:00 WellnessCheck Twitter #SickDay #Health 7.0 15.0 Canada 2023 1 18 10
12 12 12 Exploring the city's hidden gems. ... Positive 2023-01-18 14:50:00 UrbanExplorer Facebook #CityExplore #HiddenGems 12.0 25.0 UK 2023 1 18 14
13 13 13 New year, new fitness goals! 💪 ... Positive 2023-01-18 18:00:00 FitJourney Instagram #NewYear #FitnessGoals 28.0 55.0 USA 2023 1 18 18
14 14 14 Technology is changing the way we live. ... Neutral 2023-01-19 09:45:00 TechEnthusiast Twitter #Tech #Innovation 15.0 30.0 India 2023 1 19 9
15 15 15 Reflecting on the past and looking ahead. ... Positive 2023-01-19 13:20:00 Reflections Facebook #Reflection #Future 20.0 40.0 USA 2023 1 19 13
16 16 16 Just adopted a cute furry friend! 🐾 ... Positive 2023-01-19 17:10:00 PetAdopter Instagram #PetAdoption #FurryFriend 15.0 30.0 Canada 2023 1 19 17
17 17 17 Late-night gaming session with friends. ... Positive 2023-01-20 00:05:00 GamerX Twitter #Gaming #LateNight 18.0 35.0 UK 2023 1 20 0
18 18 18 Attending a virtual conference on AI. ... Neutral 2023-01-20 11:30:00 TechConference Facebook #AI #TechConference 25.0 50.0 USA 2023 1 20 11
19 19 19 Winter blues got me feeling low. ... Negative 2023-01-20 15:15:00 WinterBlues Instagram #WinterBlues #Mood 8.0 15.0 USA 2023 1 20 15
20 20 20 Sipping coffee and enjoying a good book. ... Positive 2023-01-21 08:40:00 Bookworm Twitter #Reading #CoffeeTime 22.0 45.0 India 2023 1 21 8
21 21 21 Exploring the world of virtual reality. ... Positive 2023-01-21 13:20:00 VRExplorer Facebook #VR #VirtualReality 15.0 30.0 USA 2023 1 21 13
22 22 22 Productive day ticking off my to-do list. ... Positive 2023-01-21 16:45:00 ProductivityPro Instagram #Productivity #WorkFromHome 30.0 60.0 USA 2023 1 21 16
23 23 23 Just finished a challenging workout routine. ... Positive 2023-01-22 09:15:00 FitnessWarrior Twitter #Fitness #ChallengeAccepted 20.0 40.0 UK 2023 1 22 9
24 24 24 Celebrating a milestone at work! 🎉 ... Positive 2023-01-22 14:30:00 CareerMilestone Facebook #Career #Milestone 12.0 25.0 Canada 2023 1 22 14
25 25 25 Sunday brunch with friends. ... Positive 2023-01-22 12:00:00 BrunchBuddy Instagram #Brunch #Friends 15.0 30.0 UK 2023 1 22 12
26 27 28 Learning a new language for personal growth. ... Positive 2023-01-23 16:20:00 LanguageLearner Facebook #LanguageLearning #PersonalGrowth 25.0 50.0 India 2023 1 23 16
27 28 29 Quiet evening with a good book. ... Positive 2023-01-23 19:45:00 BookLover Instagram #Reading #QuietTime 15.0 30.0 Australia 2023 1 23 19
28 29 30 Reflecting on the importance of mental health... Positive 2023-01-24 11:30:00 MentalHealthMatters Twitter #MentalHealth #SelfCare 22.0 45.0 USA 2023 1 24 11
29 30 31 New painting in progress! 🎨 ... Positive 2023-01-24 15:00:00 ArtistInAction Facebook #Art #PaintingInProgress 12.0 25.0 Canada 2023 1 24 15
30 31 32 Weekend road trip to explore scenic views. ... Positive 2023-01-24 17:30:00 RoadTripper Instagram #RoadTrip #ScenicViews 18.0 35.0 UK 2023 1 24 17
31 32 33 Enjoying a cup of tea and watching the sunset... Positive 2023-01-25 18:00:00 SunsetWatcher Twitter #TeaTime #Sunset 15.0 30.0 India 2023 1 25 18
32 33 34 Coding a new project with enthusiasm. ... Positive 2023-01-25 13:15:00 CodeEnthusiast Facebook #Coding #Enthusiasm 30.0 60.0 USA 2023 1 25 13
33 34 35 Feeling inspired after attending a workshop. ... Positive 2023-01-26 09:45:00 WorkshopAttendee Instagram #Inspiration #Workshop 25.0 50.0 USA 2023 1 26 9
34 35 36 Winter sports day at the local park. ... Positive 2023-01-26 14:20:00 WinterSports Twitter #WinterSports #Fun 15.0 30.0 Canada 2023 1 26 14
35 36 37 Quality time with family this weekend. ... Positive 2023-01-26 17:40:00 FamilyTime Facebook #FamilyTime #Weekend 22.0 45.0 UK 2023 1 26 17
36 37 38 Attending a live music concert tonight. ... Positive 2023-01-27 20:00:00 MusicLover Instagram #Music #ConcertNight 18.0 35.0 USA 2023 1 27 20
37 38 39 Practicing mindfulness with meditation. ... Positive 2023-01-27 12:30:00 MindfulMoments Twitter #Mindfulness #Meditation 15.0 30.0 India 2023 1 27 12
38 39 40 Trying out a new dessert recipe. ... Positive 2023-01-27 16:10:00 DessertExplorer Facebook #Dessert #Cooking 12.0 25.0 Canada 2023 1 27 16
39 40 41 Excited about the upcoming gaming tournament.... Positive 2023-01-28 09:00:00 GamingEnthusiast Instagram #Gaming #Tournament 30.0 60.0 USA 2023 1 28 9

Counting the number of likes for each Sentiment¶

In [11]:
df['Sentiment'] = df['Sentiment'].str.strip()
Sentiment = df.groupby('Sentiment')['Likes'].sum().reset_index()
In [12]:
Sentiment.head()
Out[12]:
Sentiment Likes
0 Acceptance 273.0
1 Accomplishment 155.0
2 Admiration 175.0
3 Adoration 90.0
4 Adrenaline 45.0
In [13]:
import matplotlib.pyplot as plt
import seaborn as sns
Sentiment = Sentiment.sort_values(by='Likes', ascending=False)


plt.figure(figsize=(60,20))
ax = sns.barplot(x='Sentiment', y='Likes', data=Sentiment, palette="viridis")

# Set the title and labels (x holds the sentiment categories, y the summed likes)
ax.set_title('Total Likes for each Sentiment', fontsize=35)
ax.set_xlabel('Sentiment', fontsize=25)
ax.set_ylabel('Total Likes', fontsize=25)

# Rotate the x-axis labels and size the tick labels on both axes.
# tick_params avoids the FixedFormatter warning raised by set_yticklabels.
ax.tick_params(axis='x', labelrotation=90, labelsize=20)
ax.tick_params(axis='y', labelsize=20)

# Display the plot
plt.show()

The plot above shows that Joy received the most likes (about 2,400), followed by Excitement (approximately 1,800), Positive (around 900), and Contentment (around 800).¶

In [14]:
import matplotlib.pyplot as plt
import seaborn as sns
Sentiment_1= Sentiment.sort_values(by='Likes', ascending=True).head(50)


plt.figure(figsize=(30,10))
ax = sns.barplot(x='Sentiment', y='Likes', data=Sentiment_1, palette="viridis")

# Set the title and labels (x holds the sentiment categories, y the summed likes)
ax.set_title('50 Sentiments with the Fewest Total Likes', fontsize=20)
ax.set_xlabel('Sentiment', fontsize=18)
ax.set_ylabel('Total Likes', fontsize=18)

# Rotate x-axis labels
ax.set_xticklabels(ax.get_xticklabels(), rotation=90, fontsize=20);

# Display the plot
plt.show()

Sentiments such as Artistic, Confidence, and Helplessness have the fewest likes, below 30. Positivity, Pressure, Overjoyed, Intrigue, Celestial Wonder, Melodic, Positive, Triumph, and Blessed each have fewer than 35 likes.¶
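The highest and lowest totals discussed above can be pulled out directly instead of being read from the bar charts. This is a minimal sketch on made-up numbers; `toy` is a hypothetical stand-in for `df`, and only the `Sentiment` and `Likes` column names come from the notebook.

```python
import pandas as pd

# Hypothetical sample rows; the real notebook uses the full `df`.
toy = pd.DataFrame({
    "Sentiment": ["Joy", "Joy ", "Excitement", "Artistic", "Pressure"],
    "Likes": [40.0, 35.0, 50.0, 10.0, 15.0],
})

# Same cleaning and aggregation as cells In [11] and In [12].
toy["Sentiment"] = toy["Sentiment"].str.strip()
likes_by_sentiment = toy.groupby("Sentiment")["Likes"].sum()

# nlargest / nsmallest return the extremes already sorted.
top = likes_by_sentiment.nlargest(2)
bottom = likes_by_sentiment.nsmallest(2)
print(top)
print(bottom)
```

Printing the two Series gives exact totals, which is easier to quote in the commentary than values estimated from bar heights.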

Checking Countrywise distribution of likes¶

In [15]:
df['Country'] = df['Country'].str.strip()

Country_like = df[['Country','Likes']]

Country_like = Country_like.groupby('Country')['Likes'].sum().reset_index()

Country_like.head()
Out[15]:
Country Likes
0 Australia 2926.0
1 Austria 90.0
2 Belgium 140.0
3 Brazil 900.0
4 Cambodia 40.0
In [16]:
import matplotlib.pyplot as plt
import seaborn as sns
Country_like = Country_like.sort_values(by='Likes', ascending=False)

plt.figure(figsize=(30,20))
ax = sns.barplot(x='Country', y='Likes', data=Country_like, palette="inferno")

# Set the title and labels (x holds the countries, y the summed likes)
ax.set_title('Total Likes for each Country', fontsize=20)
ax.set_xlabel('Country', fontsize=20)
ax.set_ylabel('Total Likes', fontsize=18)

# Rotate the x-axis labels and size the tick labels on both axes.
# tick_params avoids the FixedFormatter warning raised by set_yticklabels.
ax.tick_params(axis='x', labelrotation=90, labelsize=20)
ax.tick_params(axis='y', labelsize=20)

# Display the plot
plt.show()
In [17]:
Country_like.head()
Out[17]:
Country Likes
32 USA 8358.0
31 UK 5827.0
5 Canada 5488.0
0 Australia 2926.0
13 India 2675.0

The bar plot above shows the total number of likes per country. The USA has the most likes (8,358), followed by the UK (5,827), Canada (5,488), Australia (2,926), and India (2,675). China recorded the fewest likes, along with Cambodia, Kenya, Maldives, Norway, and Scotland. Because these are summed likes, the ranking depends on how many posts from each country appear in the dataset as well as on how popular those posts were, so it is only a rough indication of social media usage in these countries.¶
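To separate posting volume from per-post popularity, the total can be reported next to the post count and the mean likes per post. The sketch below uses made-up rows; `toy` and the output column names `total_likes`, `posts`, and `mean_likes` are hypothetical, while `Country` and `Likes` are the notebook's columns.

```python
import pandas as pd

# Hypothetical sample rows standing in for the notebook's `df`.
toy = pd.DataFrame({
    "Country": ["USA", "USA", "USA", "Norway", "India", "India"],
    "Likes": [30.0, 40.0, 20.0, 50.0, 25.0, 35.0],
})

# Named aggregation: one output column per statistic.
summary = (
    toy.groupby("Country")["Likes"]
    .agg(total_likes="sum", posts="count", mean_likes="mean")
    .sort_values("total_likes", ascending=False)
)
print(summary)
```

In this toy data the USA leads on total likes only because it has the most posts, while Norway's single post has the highest mean. The same effect may or may not hold in the real dataset; the table makes it checkable.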

In [18]:
df['Hashtags']=df['Hashtags'].str.strip()

df['Year'].unique()
Out[18]:
array([2023, 2010, 2021, 2011, 2022, 2012, 2013, 2014, 2015, 2016, 2017,
       2018, 2019, 2020], dtype=int64)
In [19]:
Hashtags = df[['Hashtags','Retweets','Year']]

Hashtags_2023 = Hashtags[Hashtags['Year']== 2023]
Hashtags_2022 = Hashtags[Hashtags['Year']== 2022]
Hashtags_2021 = Hashtags[Hashtags['Year']== 2021]
Hashtags_2020 = Hashtags[Hashtags['Year']== 2020]
Hashtags_2019 = Hashtags[Hashtags['Year']== 2019]
#Hashtags = Hashtags.groupby('Hashtags')['Retweets'].sum().reset_index()

#Hashtags = Hashtags.head(50)

Hashtags_2023.head()
Out[19]:
Hashtags Retweets Year
0 #Nature #Park 15.0 2023
1 #Traffic #Morning 5.0 2023
2 #Fitness #Workout 20.0 2023
3 #Travel #Adventure 8.0 2023
4 #Cooking #Food 12.0 2023
In [20]:
#YEAR 2019

import matplotlib.pyplot as plt
import seaborn as sns
Hashtags_2019 = Hashtags_2019.sort_values(by='Retweets', ascending=False)

plt.figure(figsize=(50,20))
ax = sns.barplot(x='Hashtags', y='Retweets', data=Hashtags_2019, palette="inferno")

# Set the title and labels
ax.set_title('Year 2019 Top Hashtags', fontsize=40)
ax.set_xlabel('Hashtags 2019', fontsize=35)
ax.set_ylabel('Total Retweets', fontsize=35)

# Rotate the x-axis labels and size the tick labels on both axes.
# tick_params avoids the FixedFormatter warning raised by set_yticklabels.
ax.tick_params(axis='x', labelrotation=90, labelsize=30)
ax.tick_params(axis='y', labelsize=30)

# Display the plot
plt.show()
In [21]:
#YEAR 2020

import matplotlib.pyplot as plt
import seaborn as sns
Hashtags_2020 = Hashtags_2020.sort_values(by='Retweets', ascending=False)

plt.figure(figsize=(50,20))
ax = sns.barplot(x='Hashtags', y='Retweets', data=Hashtags_2020, palette="viridis")

# Set the title and labels
ax.set_title('Year 2020 Top Hashtags', fontsize=40)
ax.set_xlabel('Hashtags 2020', fontsize=35)
ax.set_ylabel('Total Retweets', fontsize=35)

# Rotate the x-axis labels and size the tick labels on both axes.
# tick_params avoids the FixedFormatter warning raised by set_yticklabels.
ax.tick_params(axis='x', labelrotation=90, labelsize=30)
ax.tick_params(axis='y', labelsize=30)

# Display the plot
plt.show()
In [22]:
#YEAR 2021

import matplotlib.pyplot as plt
import seaborn as sns
Hashtags_2021 = Hashtags_2021.sort_values(by='Retweets', ascending=False)

plt.figure(figsize=(50,20))
ax = sns.barplot(x='Hashtags', y='Retweets', data=Hashtags_2021, palette="magma")

# Set the title and labels
ax.set_title('Year 2021 Top Hashtags', fontsize=40)
ax.set_xlabel('Hashtags 2021', fontsize=35)
ax.set_ylabel('Total Retweets', fontsize=35)

# Rotate the x-axis labels and size the tick labels on both axes.
# tick_params avoids the FixedFormatter warning raised by set_yticklabels.
ax.tick_params(axis='x', labelrotation=90, labelsize=30)
ax.tick_params(axis='y', labelsize=30)

# Display the plot
plt.show()
In [23]:
#YEAR 2022

import matplotlib.pyplot as plt
import seaborn as sns
Hashtags_2022 = Hashtags_2022.sort_values(by='Retweets', ascending=False)

plt.figure(figsize=(50,20))
ax = sns.barplot(x='Hashtags', y='Retweets', data=Hashtags_2022, palette="plasma")

# Set the title and labels
ax.set_title('Year 2022 Top Hashtags', fontsize=40)
ax.set_xlabel('Hashtags 2022', fontsize=35)
ax.set_ylabel('Total Retweets', fontsize=35)

# Rotate the x-axis labels and size the tick labels on both axes.
# tick_params avoids the FixedFormatter warning raised by set_yticklabels.
ax.tick_params(axis='x', labelrotation=90, labelsize=30)
ax.tick_params(axis='y', labelsize=30)

# Display the plot
plt.show()
In [24]:
Hashtags_2022.shape
Out[24]:
(63, 3)
In [25]:
#YEAR 2023

import matplotlib.pyplot as plt
import seaborn as sns
Hashtags_2023 = Hashtags_2023.sort_values(by='Retweets', ascending=False).head(50)

plt.figure(figsize=(50,20))
ax = sns.barplot(x='Hashtags', y='Retweets', data=Hashtags_2023, palette="magma")

# Set the title and labels
ax.set_title('Year 2023 Top 50 Hashtags', fontsize=40)
ax.set_xlabel('Hashtags 2023', fontsize=35)
ax.set_ylabel('Total Retweets', fontsize=35)

# Rotate the x-axis labels and size the tick labels on both axes.
# tick_params avoids the FixedFormatter warning raised by set_yticklabels.
ax.tick_params(axis='x', labelrotation=90, labelsize=30)
ax.tick_params(axis='y', labelsize=30)

# Display the plot
plt.show()

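The bar plots above show the most-retweeted hashtag combinations in the dataset for each year from 2019 through 2023. The code does not filter by platform, so the posts are drawn from all platforms in the dataset rather than from Twitter alone. From these visualizations, we can discern the shifting landscapes of social media discourse and the varying interests of the digital populace over these five years.¶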

In 2019, the leading hashtags reveal a focus on personal growth and achievement, as indicated by tags like #SuccessSmiles, #CulinaryJoy, and #GoalDigger, suggesting a collective interest in self-improvement and culinary experiences.¶

By 2020, the trend shifts slightly towards mindfulness and resilience, with hashtags such as #LifeInBloom and #SerenityHues, possibly reflecting a societal response to the global challenges of that year.¶

In 2021, the prominence of hashtags like #DreamBig and #HustleHarder indicates a renewed emphasis on ambition and hard work, perhaps as a reaction to the economic and social pressures of the preceding year.¶

The year 2022 sees a balance between aspiration, represented by #InnovationDrive, and community, suggested by #TogetherWeCan, highlighting a dual focus on technological progress and social unity.¶

Finally, in 2023, the leading hashtags like #HeartwarmingMoments and #SelfLoveWave suggest a pivot towards emotional well-being and self-care, emphasizing the importance of mental health and personal well-being after several tumultuous years.¶

This progression of trending topics not only mirrors the global context of each year but also underscores the evolving priorities and values of the users in the dataset. The ebb and flow of these digital conversations reflect the world's collective experiences, changing aspirations, and the shared journey through the years, as captured in the microcosm of social media hashtags.¶
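The per-year cells above plot each post's full hashtag string, and seaborn's `barplot` shows the mean by default when an x value repeats, so the bars are not strictly totals. A possible alternative, in the spirit of the commented-out `groupby` in cell In [19], is to split the strings into individual tags and sum the retweets per tag. The helper name `top_hashtags` and the sample rows are hypothetical.

```python
import pandas as pd


def top_hashtags(frame, year, n=10):
    """Return summed retweets per individual hashtag for one year.

    Hypothetical helper; it expects the notebook's 'Hashtags',
    'Retweets' and 'Year' columns.
    """
    subset = frame.loc[frame["Year"] == year, ["Hashtags", "Retweets"]].copy()
    # '#Nature #Park' -> ['#Nature', '#Park']; explode gives one row per tag.
    subset["Hashtags"] = subset["Hashtags"].str.split()
    exploded = subset.explode("Hashtags")
    return exploded.groupby("Hashtags")["Retweets"].sum().nlargest(n)


# Made-up rows standing in for the notebook's `df`.
toy = pd.DataFrame({
    "Hashtags": ["#Nature #Park", "#Fitness #Workout", "#Nature #Fitness", "#Travel"],
    "Retweets": [15.0, 20.0, 6.0, 8.0],
    "Year": [2023, 2023, 2023, 2022],
})

result = top_hashtags(toy, 2023, n=2)
print(result)
```

Calling the helper once per year would also replace the five near-identical plotting cells with a loop, since the returned Series can be passed to a bar plot directly.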

Applying Machine Learning Model for Sentiment Prediction¶

1) Naive Bayes MultinomialNB Model¶

In [26]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report, confusion_matrix


# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(df['Text'], df['Sentiment'], test_size=0.2, random_state=42)

# Create a pipeline that combines the vectorizer with the Naive Bayes classifier
model = make_pipeline(TfidfVectorizer(stop_words='english'), MultinomialNB())

# Fitting model
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# Check the accuracy
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")

print("Classification Report:\n", classification_report(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
Accuracy: 0.1360544217687075
Classification Report:
                       precision    recall  f1-score   support

          Acceptance       0.00      0.00      0.00         2
          Admiration       0.00      0.00      0.00         2
           Affection       0.00      0.00      0.00         1
         Ambivalence       0.00      0.00      0.00         1
               Anger       0.00      0.00      0.00         1
        Anticipation       0.00      0.00      0.00         1
             Arousal       0.00      0.00      0.00         3
                 Awe       0.00      0.00      0.00         2
                 Bad       0.00      0.00      0.00         1
            Betrayal       0.00      0.00      0.00         3
              Bitter       0.00      0.00      0.00         1
          Bitterness       0.00      0.00      0.00         1
         Bittersweet       0.00      0.00      0.00         1
             Boredom       0.00      0.00      0.00         1
            Calmness       0.00      0.00      0.00         1
         Captivation       0.00      0.00      0.00         1
    Celestial Wonder       0.00      0.00      0.00         1
            Colorful       0.00      0.00      0.00         1
           Confusion       0.00      0.00      0.00         3
          Connection       0.00      0.00      0.00         1
       Contemplation       0.00      0.00      0.00         1
         Contentment       0.00      0.00      0.00         4
            Coziness       0.00      0.00      0.00         1
          Creativity       0.00      0.00      0.00         1
           Curiosity       1.00      0.20      0.33         5
          Desolation       0.00      0.00      0.00         1
          Devastated       0.00      0.00      0.00         2
             Disgust       0.00      0.00      0.00         3
             Elation       0.00      0.00      0.00         3
            Elegance       0.00      0.00      0.00         1
         Embarrassed       0.00      0.00      0.00         1
      EmotionalStorm       0.00      0.00      0.00         1
         Empowerment       0.00      0.00      0.00         1
           Enjoyment       0.00      0.00      0.00         2
          Enthusiasm       0.00      0.00      0.00         1
             Envious       0.00      0.00      0.00         2
 Envisioning History       0.00      0.00      0.00         1
            Euphoria       0.00      0.00      0.00         1
          Excitement       0.33      0.57      0.42         7
                Fear       0.00      0.00      0.00         1
             Fearful       0.00      0.00      0.00         1
          Frustrated       0.00      0.00      0.00         1
         Frustration       0.00      0.00      0.00         3
         Fulfillment       0.00      0.00      0.00         2
            Grateful       0.00      0.00      0.00         1
               Grief       0.00      0.00      0.00         1
               Happy       0.00      0.00      0.00         6
                Hate       0.00      0.00      0.00         2
          Heartbreak       0.00      0.00      0.00         2
             Hopeful       1.00      1.00      1.00         1
        InnerJourney       0.00      0.00      0.00         1
         Inspiration       0.00      0.00      0.00         1
            Inspired       0.00      0.00      0.00         1
           Isolation       0.00      0.00      0.00         1
            Jealousy       0.00      0.00      0.00         1
                 Joy       0.19      0.89      0.31         9
       JoyfulReunion       0.00      0.00      0.00         1
                Kind       0.00      0.00      0.00         1
          Loneliness       0.00      0.00      0.00         2
            LostLove       0.00      0.00      0.00         1
          Melancholy       0.00      0.00      0.00         2
      Miscalculation       0.00      0.00      0.00         1
             Neutral       0.00      0.00      0.00         1
           Nostalgia       0.00      0.00      0.00         2
            Numbness       0.00      0.00      0.00         1
         Overwhelmed       0.00      0.00      0.00         1
             Playful       0.00      0.00      0.00         2
            Positive       0.07      0.67      0.12         9
               Proud       0.00      0.00      0.00         1
          Reflection       0.00      0.00      0.00         1
              Regret       0.00      0.00      0.00         1
          Resilience       0.00      0.00      0.00         1
           Reverence       0.00      0.00      0.00         1
             Sadness       0.00      0.00      0.00         2
        Satisfaction       0.00      0.00      0.00         1
            Serenity       0.00      0.00      0.00         4
            Solitude       0.00      0.00      0.00         1
              Sorrow       0.00      0.00      0.00         1
               Spark       0.00      0.00      0.00         1
            Surprise       0.00      0.00      0.00         1
              Thrill       0.00      0.00      0.00         1
            Vibrancy       0.00      0.00      0.00         1
Whispers of the Past       0.00      0.00      0.00         1
                Zest       0.00      0.00      0.00         1

            accuracy                           0.14       147
           macro avg       0.03      0.04      0.03       147
        weighted avg       0.07      0.14      0.06       147

Confusion Matrix:
 [[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
D:\Python\lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))

The accuracy of the model is 0.14: only about 14% of the predictions (20 of the 147 test posts) were correct. This is a very low accuracy rate and shows that the model is not performing well on the dataset.¶

The macro-average precision is 0.03, recall is 0.04, and f1-score is 0.03. The macro average weights every class equally, so these values reflect that most classes, many with only one or two test samples, are never predicted correctly.¶

The weighted-average precision is 0.07, recall is 0.14, and f1-score is 0.06. These scores are also low. They are slightly higher than the macro averages because the few correct predictions are concentrated in the more frequent classes, such as Joy, Positive, and Excitement.¶
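A low accuracy is easier to judge against a trivial baseline. In the report above, the largest test classes (Joy and Positive) have 9 of 147 samples each, so always guessing one class could score at most about 6%. The sketch below shows how such a baseline could be measured with scikit-learn's `DummyClassifier`; the labels and placeholder features are made up, and the stratified split is an assumption that the notebook itself does not use.

```python
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data: DummyClassifier ignores the features, so
# placeholder values are enough to illustrate the idea.
features = [[i] for i in range(20)]
labels = ["Joy"] * 12 + ["Positive"] * 4 + ["Excitement"] * 4

# stratify keeps the class proportions the same in train and test
# (an assumption here; the notebook's split is not stratified).
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=42, stratify=labels
)

# Always predict the most frequent training label.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))
print(f"Baseline accuracy: {baseline_acc:.2f}")
```

Any real model should beat this number. If it does not, the text features are adding nothing over simply guessing the majority class.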

2) Random Forest Model¶

In [27]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.ensemble import RandomForestClassifier

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(df['Text'], df['Sentiment'], test_size=0.2, random_state=42)

tfidf_vectorizer = TfidfVectorizer(stop_words='english')

# Fit and transform the training data
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)

X_test_tfidf = tfidf_vectorizer.transform(X_test)

# Initialize the Random Forest classifier
model_rf = RandomForestClassifier(n_estimators=100, random_state=42)

# Fit the model
model_rf.fit(X_train_tfidf, y_train)

# Predict the test labels
y_pred_tfidf = model_rf.predict(X_test_tfidf)

# Evaluate the predictions
print("Classification Report:\n", classification_report(y_test, y_pred_tfidf))
print(f"Accuracy: {accuracy_score(y_test, y_pred_tfidf)}")
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_tfidf))
Classification Report:
                       precision    recall  f1-score   support

          Acceptance       0.67      1.00      0.80         2
          Admiration       0.00      0.00      0.00         2
           Affection       0.00      0.00      0.00         1
         Ambivalence       1.00      1.00      1.00         1
               Anger       0.00      0.00      0.00         1
        Anticipation       0.00      0.00      0.00         1
             Anxiety       0.00      0.00      0.00         0
             Arousal       0.50      0.33      0.40         3
       ArtisticBurst       0.00      0.00      0.00         0
                 Awe       0.33      0.50      0.40         2
                 Bad       1.00      1.00      1.00         1
            Betrayal       0.50      0.33      0.40         3
              Bitter       1.00      1.00      1.00         1
          Bitterness       1.00      1.00      1.00         1
         Bittersweet       0.00      0.00      0.00         1
             Boredom       0.00      0.00      0.00         1
            Calmness       0.00      0.00      0.00         1
         Captivation       0.00      0.00      0.00         1
    Celestial Wonder       0.00      0.00      0.00         1
            Colorful       0.00      0.00      0.00         1
           Confusion       0.67      0.67      0.67         3
          Connection       0.00      0.00      0.00         1
       Contemplation       0.00      0.00      0.00         1
         Contentment       1.00      0.25      0.40         4
            Coziness       1.00      1.00      1.00         1
          Creativity       0.00      0.00      0.00         1
           Curiosity       1.00      0.60      0.75         5
          Desolation       0.50      1.00      0.67         1
             Despair       0.00      0.00      0.00         0
          Devastated       0.00      0.00      0.00         2
             Disgust       0.00      0.00      0.00         3
             Elation       1.00      1.00      1.00         3
            Elegance       0.00      0.00      0.00         1
         Embarrassed       1.00      1.00      1.00         1
      EmotionalStorm       0.00      0.00      0.00         1
         Empowerment       0.00      0.00      0.00         1
         Enchantment       0.00      0.00      0.00         0
           Enjoyment       0.00      0.00      0.00         2
          Enthusiasm       0.50      1.00      0.67         1
             Envious       0.00      0.00      0.00         2
 Envisioning History       0.00      0.00      0.00         1
            Euphoria       1.00      1.00      1.00         1
          Excitement       0.50      0.43      0.46         7
                Fear       0.00      0.00      0.00         1
             Fearful       1.00      1.00      1.00         1
          Frustrated       0.00      0.00      0.00         1
         Frustration       1.00      0.33      0.50         3
         Fulfillment       1.00      1.00      1.00         2
            Grateful       1.00      1.00      1.00         1
           Gratitude       0.00      0.00      0.00         0
               Grief       1.00      1.00      1.00         1
           Happiness       0.00      0.00      0.00         0
               Happy       0.00      0.00      0.00         6
                Hate       0.00      0.00      0.00         2
          Heartbreak       0.00      0.00      0.00         2
             Hopeful       1.00      1.00      1.00         1
        InnerJourney       0.00      0.00      0.00         1
         Inspiration       0.50      1.00      0.67         1
            Inspired       1.00      1.00      1.00         1
           Isolation       0.00      0.00      0.00         1
            Jealousy       0.00      0.00      0.00         1
                 Joy       0.56      0.56      0.56         9
       JoyfulReunion       0.00      0.00      0.00         1
                Kind       0.00      0.00      0.00         1
          Loneliness       1.00      1.00      1.00         2
            LostLove       0.00      0.00      0.00         1
                Love       0.00      0.00      0.00         0
          Melancholy       1.00      1.00      1.00         2
      Miscalculation       0.00      0.00      0.00         1
            Negative       0.00      0.00      0.00         0
             Neutral       0.00      0.00      0.00         1
           Nostalgia       1.00      0.50      0.67         2
            Numbness       1.00      1.00      1.00         1
         Overwhelmed       1.00      1.00      1.00         1
             Playful       1.00      0.50      0.67         2
            Positive       0.12      0.89      0.22         9
               Proud       1.00      1.00      1.00         1
          Reflection       0.00      0.00      0.00         1
              Regret       1.00      1.00      1.00         1
          Resentment       0.00      0.00      0.00         0
          Resilience       0.00      0.00      0.00         1
           Reverence       1.00      1.00      1.00         1
               Ruins       0.00      0.00      0.00         0
             Sadness       0.00      0.00      0.00         2
        Satisfaction       0.00      0.00      0.00         1
            Serenity       1.00      0.50      0.67         4
            Solitude       0.00      0.00      0.00         1
              Sorrow       0.00      0.00      0.00         1
               Spark       0.00      0.00      0.00         1
            Surprise       0.00      0.00      0.00         1
              Thrill       0.00      0.00      0.00         1
            Vibrancy       0.00      0.00      0.00         1
Whispers of the Past       0.00      0.00      0.00         1
                Zest       0.00      0.00      0.00         1

            accuracy                           0.41       147
           macro avg       0.34      0.33      0.32       147
        weighted avg       0.44      0.41      0.39       147

Accuracy: 0.41496598639455784
Confusion Matrix:
 [[2 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [1 0 0 ... 0 0 0]]
D:\Python\lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
D:\Python\lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))

Note: The warnings in the output above indicate that precision, recall, and F-score are ill-defined for some labels. This happens in two situations: the model never predicted a label, so its precision is undefined, or a label has no true samples in the test set, so its recall is undefined. Both are common in a highly imbalanced dataset where many classes are underrepresented.¶

The warning message suggests using the zero_division parameter to control this behavior. Precision divides by the number of predicted samples for a class (TP + FP), and recall divides by the number of true samples (TP + FN), so either denominator can be zero. Setting zero_division=1 assigns a value of 1 (100%) to the affected metric whenever that happens. This silences the warning, but it inflates the scores of classes the model never predicted, so it is rarely informative for model evaluation.¶

Setting zero_division=0 instead assigns 0 (0%) when division by zero occurs. This reproduces the values shown above while suppressing the warnings, and is the more appropriate choice in most cases.¶
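
The effect of the two settings can be seen on a tiny example. This is a minimal sketch with hypothetical toy labels (not the notebook's data) in which the model never predicts "Fear", so the precision for that class has a zero denominator.

```python
from sklearn.metrics import precision_score

# Toy labels (hypothetical): the model never predicts "Fear"
y_true_toy = ["Joy", "Joy", "Fear"]
y_pred_toy = ["Joy", "Joy", "Joy"]
labels = ["Fear", "Joy"]

# Per-class precision with the undefined value replaced by 0
precision_zero = precision_score(
    y_true_toy, y_pred_toy, labels=labels, average=None, zero_division=0
)

# Per-class precision with the undefined value replaced by 1
precision_one = precision_score(
    y_true_toy, y_pred_toy, labels=labels, average=None, zero_division=1
)

for label, p0, p1 in zip(labels, precision_zero, precision_one):
    print(f"{label}: zero_division=0 -> {p0:.2f}, zero_division=1 -> {p1:.2f}")
```

Only the undefined entry changes between the two calls. The precision for "Joy" is 2/3 in both, because its denominator is not zero.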

In [28]:
print("Classification Report:\n", classification_report(y_test, y_pred_tfidf,zero_division=0))
print(f"Accuracy: {accuracy_score(y_test, y_pred_tfidf)}")

print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_tfidf))
Classification Report:
                       precision    recall  f1-score   support

          Acceptance       0.67      1.00      0.80         2
          Admiration       0.00      0.00      0.00         2
           Affection       0.00      0.00      0.00         1
         Ambivalence       1.00      1.00      1.00         1
               Anger       0.00      0.00      0.00         1
        Anticipation       0.00      0.00      0.00         1
             Anxiety       0.00      0.00      0.00         0
             Arousal       0.50      0.33      0.40         3
       ArtisticBurst       0.00      0.00      0.00         0
                 Awe       0.33      0.50      0.40         2
                 Bad       1.00      1.00      1.00         1
            Betrayal       0.50      0.33      0.40         3
              Bitter       1.00      1.00      1.00         1
          Bitterness       1.00      1.00      1.00         1
         Bittersweet       0.00      0.00      0.00         1
             Boredom       0.00      0.00      0.00         1
            Calmness       0.00      0.00      0.00         1
         Captivation       0.00      0.00      0.00         1
    Celestial Wonder       0.00      0.00      0.00         1
            Colorful       0.00      0.00      0.00         1
           Confusion       0.67      0.67      0.67         3
          Connection       0.00      0.00      0.00         1
       Contemplation       0.00      0.00      0.00         1
         Contentment       1.00      0.25      0.40         4
            Coziness       1.00      1.00      1.00         1
          Creativity       0.00      0.00      0.00         1
           Curiosity       1.00      0.60      0.75         5
          Desolation       0.50      1.00      0.67         1
             Despair       0.00      0.00      0.00         0
          Devastated       0.00      0.00      0.00         2
             Disgust       0.00      0.00      0.00         3
             Elation       1.00      1.00      1.00         3
            Elegance       0.00      0.00      0.00         1
         Embarrassed       1.00      1.00      1.00         1
      EmotionalStorm       0.00      0.00      0.00         1
         Empowerment       0.00      0.00      0.00         1
         Enchantment       0.00      0.00      0.00         0
           Enjoyment       0.00      0.00      0.00         2
          Enthusiasm       0.50      1.00      0.67         1
             Envious       0.00      0.00      0.00         2
 Envisioning History       0.00      0.00      0.00         1
            Euphoria       1.00      1.00      1.00         1
          Excitement       0.50      0.43      0.46         7
                Fear       0.00      0.00      0.00         1
             Fearful       1.00      1.00      1.00         1
          Frustrated       0.00      0.00      0.00         1
         Frustration       1.00      0.33      0.50         3
         Fulfillment       1.00      1.00      1.00         2
            Grateful       1.00      1.00      1.00         1
           Gratitude       0.00      0.00      0.00         0
               Grief       1.00      1.00      1.00         1
           Happiness       0.00      0.00      0.00         0
               Happy       0.00      0.00      0.00         6
                Hate       0.00      0.00      0.00         2
          Heartbreak       0.00      0.00      0.00         2
             Hopeful       1.00      1.00      1.00         1
        InnerJourney       0.00      0.00      0.00         1
         Inspiration       0.50      1.00      0.67         1
            Inspired       1.00      1.00      1.00         1
           Isolation       0.00      0.00      0.00         1
            Jealousy       0.00      0.00      0.00         1
                 Joy       0.56      0.56      0.56         9
       JoyfulReunion       0.00      0.00      0.00         1
                Kind       0.00      0.00      0.00         1
          Loneliness       1.00      1.00      1.00         2
            LostLove       0.00      0.00      0.00         1
                Love       0.00      0.00      0.00         0
          Melancholy       1.00      1.00      1.00         2
      Miscalculation       0.00      0.00      0.00         1
            Negative       0.00      0.00      0.00         0
             Neutral       0.00      0.00      0.00         1
           Nostalgia       1.00      0.50      0.67         2
            Numbness       1.00      1.00      1.00         1
         Overwhelmed       1.00      1.00      1.00         1
             Playful       1.00      0.50      0.67         2
            Positive       0.12      0.89      0.22         9
               Proud       1.00      1.00      1.00         1
          Reflection       0.00      0.00      0.00         1
              Regret       1.00      1.00      1.00         1
          Resentment       0.00      0.00      0.00         0
          Resilience       0.00      0.00      0.00         1
           Reverence       1.00      1.00      1.00         1
               Ruins       0.00      0.00      0.00         0
             Sadness       0.00      0.00      0.00         2
        Satisfaction       0.00      0.00      0.00         1
            Serenity       1.00      0.50      0.67         4
            Solitude       0.00      0.00      0.00         1
              Sorrow       0.00      0.00      0.00         1
               Spark       0.00      0.00      0.00         1
            Surprise       0.00      0.00      0.00         1
              Thrill       0.00      0.00      0.00         1
            Vibrancy       0.00      0.00      0.00         1
Whispers of the Past       0.00      0.00      0.00         1
                Zest       0.00      0.00      0.00         1

            accuracy                           0.41       147
           macro avg       0.34      0.33      0.32       147
        weighted avg       0.44      0.41      0.39       147

Accuracy: 0.41496598639455784
Confusion Matrix:
 [[2 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [1 0 0 ... 0 0 0]]

The model's accuracy is about 41.5%, so fewer than half of the predictions were correct. The classification report shows that many classes have a precision, recall, and F1-score of 0.00, meaning none of their test samples was classified correctly. This points to class imbalance: most classes have only one or two samples in the test set. The earlier warnings arose from classes that were never predicted and from classes with no true samples in the test set. Overall, the model's performance is suboptimal and could benefit from further tuning and analysis.¶
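
Because the printed confusion matrix is truncated, it can help to list the affected classes directly. The following is a minimal sketch using only the standard library; the helper names and the toy labels are hypothetical, not part of the notebook.

```python
from collections import Counter


def unpredicted_labels(y_true, y_pred):
    """Labels that occur in the true labels but were never predicted."""
    return sorted(set(y_true) - set(y_pred))


def singleton_labels(y_true):
    """Labels that occur exactly once in the true labels."""
    counts = Counter(y_true)
    return sorted(label for label, n in counts.items() if n == 1)


# Toy labels (hypothetical) standing in for y_test and a model's predictions
y_true_toy = ["Joy", "Joy", "Joy", "Happy", "Fear", "Zest"]
y_pred_toy = ["Joy", "Joy", "Positive", "Joy", "Joy", "Zest"]

missing = unpredicted_labels(y_true_toy, y_pred_toy)
singletons = singleton_labels(y_true_toy)

print("Never predicted:", missing)
print("Only one test sample:", singletons)
```

With the notebook's variables, the same helpers could be called as `unpredicted_labels(y_test, y_pred_tfidf)` and `singleton_labels(y_test)`.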

3) SVM Model¶

In [29]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.svm import SVC

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(df['Text'], df['Sentiment'], test_size=0.2, random_state=42)

tfidf_vectorizer = TfidfVectorizer(stop_words='english')

# Fit and transform the training data
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)

X_test_tfidf = tfidf_vectorizer.transform(X_test)

# Build and train the SVM model
svm_model = SVC()
svm_model.fit(X_train_tfidf, y_train)

y_pred_svm = svm_model.predict(X_test_tfidf)

# Evaluate the predictions
print("Classification Report:\n", classification_report(y_test, y_pred_svm))

print(f"Accuracy: {accuracy_score(y_test, y_pred_svm)}")
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_svm))
Classification Report:
                       precision    recall  f1-score   support

          Acceptance       1.00      0.50      0.67         2
          Admiration       0.00      0.00      0.00         2
           Affection       0.00      0.00      0.00         1
         Ambivalence       0.00      0.00      0.00         1
               Anger       0.00      0.00      0.00         1
        Anticipation       0.00      0.00      0.00         1
             Arousal       0.00      0.00      0.00         3
                 Awe       0.00      0.00      0.00         2
                 Bad       0.00      0.00      0.00         1
            Betrayal       0.00      0.00      0.00         3
              Bitter       0.00      0.00      0.00         1
          Bitterness       0.00      0.00      0.00         1
         Bittersweet       0.00      0.00      0.00         1
             Boredom       0.00      0.00      0.00         1
            Calmness       0.00      0.00      0.00         1
         Captivation       0.00      0.00      0.00         1
    Celestial Wonder       0.00      0.00      0.00         1
            Colorful       0.00      0.00      0.00         1
           Confusion       0.00      0.00      0.00         3
          Connection       0.00      0.00      0.00         1
       Contemplation       0.00      0.00      0.00         1
         Contentment       0.00      0.00      0.00         4
            Coziness       0.00      0.00      0.00         1
          Creativity       0.00      0.00      0.00         1
           Curiosity       1.00      0.20      0.33         5
          Desolation       0.00      0.00      0.00         1
          Devastated       0.00      0.00      0.00         2
             Disgust       0.00      0.00      0.00         3
             Elation       0.00      0.00      0.00         3
            Elegance       0.00      0.00      0.00         1
         Embarrassed       0.00      0.00      0.00         1
      EmotionalStorm       0.00      0.00      0.00         1
         Empowerment       0.00      0.00      0.00         1
           Enjoyment       0.00      0.00      0.00         2
          Enthusiasm       0.00      0.00      0.00         1
             Envious       0.00      0.00      0.00         2
 Envisioning History       0.00      0.00      0.00         1
            Euphoria       0.00      0.00      0.00         1
          Excitement       0.57      0.57      0.57         7
                Fear       0.00      0.00      0.00         1
             Fearful       0.00      0.00      0.00         1
          Frustrated       0.00      0.00      0.00         1
         Frustration       0.00      0.00      0.00         3
         Fulfillment       0.00      0.00      0.00         2
            Grateful       1.00      1.00      1.00         1
               Grief       0.00      0.00      0.00         1
               Happy       0.00      0.00      0.00         6
                Hate       0.00      0.00      0.00         2
          Heartbreak       0.00      0.00      0.00         2
             Hopeful       1.00      1.00      1.00         1
        InnerJourney       0.00      0.00      0.00         1
         Inspiration       0.00      0.00      0.00         1
            Inspired       1.00      1.00      1.00         1
           Isolation       0.00      0.00      0.00         1
            Jealousy       0.00      0.00      0.00         1
                 Joy       0.10      1.00      0.19         9
       JoyfulReunion       0.00      0.00      0.00         1
                Kind       0.00      0.00      0.00         1
          Loneliness       1.00      0.50      0.67         2
            LostLove       0.00      0.00      0.00         1
          Melancholy       0.00      0.00      0.00         2
      Miscalculation       0.00      0.00      0.00         1
             Neutral       0.00      0.00      0.00         1
           Nostalgia       0.00      0.00      0.00         2
            Numbness       0.00      0.00      0.00         1
         Overwhelmed       0.00      0.00      0.00         1
             Playful       0.00      0.00      0.00         2
            Positive       0.11      0.56      0.19         9
               Proud       1.00      1.00      1.00         1
          Reflection       0.00      0.00      0.00         1
              Regret       0.00      0.00      0.00         1
          Resilience       0.00      0.00      0.00         1
           Reverence       0.00      0.00      0.00         1
             Sadness       0.00      0.00      0.00         2
        Satisfaction       0.00      0.00      0.00         1
            Serenity       1.00      0.25      0.40         4
            Solitude       0.00      0.00      0.00         1
              Sorrow       0.00      0.00      0.00         1
               Spark       0.00      0.00      0.00         1
            Surprise       0.00      0.00      0.00         1
              Thrill       0.00      0.00      0.00         1
            Vibrancy       0.00      0.00      0.00         1
Whispers of the Past       0.00      0.00      0.00         1
                Zest       0.00      0.00      0.00         1

            accuracy                           0.18       147
           macro avg       0.10      0.09      0.08       147
        weighted avg       0.16      0.18      0.12       147

Accuracy: 0.17687074829931973
Confusion Matrix:
 [[1 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
D:\Python\lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))

The accuracy of the model is just 17.69%, meaning that more than 82% of the predictions were incorrect. The classification report shows that most classes have scores of 0.00, which may suggest a class imbalance. The warnings relate to precision for classes where no predictions were made. Overall, this model is not a good fit for predicting the sentiment of the text data.¶

4) Logistic Regression Model¶

In [30]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.linear_model import LogisticRegression

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(df['Text'], df['Sentiment'], test_size=0.2, random_state=42)

tfidf_vectorizer = TfidfVectorizer(stop_words='english')

# Fit and transform the training data
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)

X_test_tfidf = tfidf_vectorizer.transform(X_test)

# Initialize the Logistic Regression model
Logistic_model = LogisticRegression(random_state=42)

# Fit the model
Logistic_model.fit(X_train_tfidf, y_train)

# Predict the test labels
y_pred_log = Logistic_model.predict(X_test_tfidf)

# Evaluate the predictions
print("Classification Report:\n", classification_report(y_test, y_pred_log, zero_division=0))
print(f"Accuracy: {accuracy_score(y_test, y_pred_log)}")
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_log))
Classification Report:
                       precision    recall  f1-score   support

          Acceptance       1.00      1.00      1.00         2
          Admiration       0.00      0.00      0.00         2
           Affection       0.00      0.00      0.00         1
         Ambivalence       0.00      0.00      0.00         1
               Anger       0.00      0.00      0.00         1
        Anticipation       0.00      0.00      0.00         1
             Arousal       0.00      0.00      0.00         3
                 Awe       0.00      0.00      0.00         2
                 Bad       0.00      0.00      0.00         1
            Betrayal       0.00      0.00      0.00         3
              Bitter       0.00      0.00      0.00         1
          Bitterness       0.00      0.00      0.00         1
         Bittersweet       0.00      0.00      0.00         1
             Boredom       0.00      0.00      0.00         1
            Calmness       0.00      0.00      0.00         1
         Captivation       0.00      0.00      0.00         1
    Celestial Wonder       0.00      0.00      0.00         1
            Colorful       0.00      0.00      0.00         1
           Confusion       0.00      0.00      0.00         3
          Connection       0.00      0.00      0.00         1
       Contemplation       0.00      0.00      0.00         1
         Contentment       1.00      0.25      0.40         4
            Coziness       0.00      0.00      0.00         1
          Creativity       0.00      0.00      0.00         1
           Curiosity       1.00      0.60      0.75         5
          Desolation       0.00      0.00      0.00         1
             Despair       0.00      0.00      0.00         0
          Devastated       0.00      0.00      0.00         2
             Disgust       0.00      0.00      0.00         3
             Elation       0.00      0.00      0.00         3
            Elegance       0.00      0.00      0.00         1
         Embarrassed       0.00      0.00      0.00         1
      EmotionalStorm       0.00      0.00      0.00         1
         Empowerment       0.00      0.00      0.00         1
           Enjoyment       0.00      0.00      0.00         2
          Enthusiasm       0.00      0.00      0.00         1
             Envious       0.00      0.00      0.00         2
 Envisioning History       0.00      0.00      0.00         1
            Euphoria       0.00      0.00      0.00         1
          Excitement       0.25      0.71      0.37         7
                Fear       0.00      0.00      0.00         1
             Fearful       0.00      0.00      0.00         1
          Frustrated       0.00      0.00      0.00         1
         Frustration       0.00      0.00      0.00         3
         Fulfillment       0.00      0.00      0.00         2
            Grateful       0.00      0.00      0.00         1
               Grief       0.00      0.00      0.00         1
               Happy       0.00      0.00      0.00         6
                Hate       0.00      0.00      0.00         2
          Heartbreak       0.00      0.00      0.00         2
             Hopeful       1.00      1.00      1.00         1
        InnerJourney       0.00      0.00      0.00         1
         Inspiration       0.00      0.00      0.00         1
            Inspired       0.00      0.00      0.00         1
           Isolation       0.00      0.00      0.00         1
            Jealousy       0.00      0.00      0.00         1
                 Joy       0.16      1.00      0.28         9
       JoyfulReunion       0.00      0.00      0.00         1
                Kind       0.00      0.00      0.00         1
          Loneliness       1.00      0.50      0.67         2
            LostLove       0.00      0.00      0.00         1
          Melancholy       0.00      0.00      0.00         2
      Miscalculation       0.00      0.00      0.00         1
             Neutral       0.00      0.00      0.00         1
           Nostalgia       0.00      0.00      0.00         2
            Numbness       0.00      0.00      0.00         1
         Overwhelmed       0.00      0.00      0.00         1
             Playful       0.00      0.00      0.00         2
            Positive       0.08      0.56      0.14         9
               Proud       0.00      0.00      0.00         1
          Reflection       0.00      0.00      0.00         1
              Regret       0.00      0.00      0.00         1
          Resilience       0.00      0.00      0.00         1
           Reverence       0.00      0.00      0.00         1
             Sadness       0.00      0.00      0.00         2
        Satisfaction       0.00      0.00      0.00         1
            Serenity       1.00      0.50      0.67         4
            Solitude       0.00      0.00      0.00         1
              Sorrow       0.00      0.00      0.00         1
               Spark       0.00      0.00      0.00         1
            Surprise       0.00      0.00      0.00         1
              Thrill       0.00      0.00      0.00         1
            Vibrancy       0.00      0.00      0.00         1
Whispers of the Past       0.00      0.00      0.00         1
                Zest       0.00      0.00      0.00         1

            accuracy                           0.20       147
           macro avg       0.08      0.07      0.06       147
        weighted avg       0.15      0.20      0.13       147

Accuracy: 0.19727891156462585
Confusion Matrix:
 [[2 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]

The accuracy of the logistic regression model is low, at approximately 19.73%. The classification report shows that for most classes the model did not correctly predict any instance (scores of 0.00 across precision, recall, and F1-score). The predictions are concentrated in a few classes: Joy and Positive have high recall (1.00 and 0.56) but very low precision (0.16 and 0.08), which indicates that many samples from other classes were assigned to them.¶
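
One way to quantify how concentrated the predictions are is to count how often each label is predicted. This is a minimal sketch; the function name `top_k_share` and the toy predictions are hypothetical.

```python
from collections import Counter


def top_k_share(y_pred, k=2):
    """Return the k most predicted labels and the share of all predictions they cover."""
    counts = Counter(y_pred)
    top = counts.most_common(k)
    share = sum(n for _, n in top) / len(y_pred)
    return top, share


# Toy predictions (hypothetical): most samples are assigned to two labels
y_pred_toy = ["Joy"] * 6 + ["Positive"] * 3 + ["Hopeful"]

top_labels, share = top_k_share(y_pred_toy, k=2)
print("Most predicted labels:", top_labels)
print(f"Share of all predictions: {share:.0%}")
```

Applied to the notebook's output, `top_k_share(y_pred_log)` would show how much of the test set the model assigns to its two favourite classes.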

In [31]:
Output = ['Naive Bayes MultinomialNB : 16.7%', 'Random Forest Model : 41.50%', 'SVM Model : 17.69%', 'Logistic Regression Model : 19.73%']
In [ ]:
output_dict = {
    'Naive Bayes MultinomialNB': '16.7%',
    'Random Forest Model': '41.49%',
    'SVM Model': '17.68%',
    'Logistic Regression Model': '19.73%'  # rounded from 0.19727891156462585
}

Model_accuracy = pd.DataFrame(list(output_dict.items()), columns=['Model', 'Accuracy'])

print(Model_accuracy)

The table shows that the Random Forest model has the highest accuracy (41.49%) of the four models on this dataset, although it still misclassifies more than half of the test samples. We therefore create a function that predicts the sentiment of a text using the Random Forest model.
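Storing the accuracies as numbers rather than percentage strings lets the best model be selected in code instead of by eye. This is a small sketch under that assumption; the names `accuracies`, `ranked`, `best_model`, and `best_accuracy` are illustrative and do not appear elsewhere in the notebook.

```python
# Accuracies in percent, as reported by the four evaluations above.
# The Logistic Regression value is rounded from 0.19727891156462585.
accuracies = {
    "Naive Bayes MultinomialNB": 16.7,
    "Random Forest Model": 41.49,
    "SVM Model": 17.68,
    "Logistic Regression Model": 19.73,
}

# Rank models from highest to lowest accuracy
ranked = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)
best_model, best_accuracy = ranked[0]

for name, acc in ranked:
    print(f"{name:<28}{acc:>6.2f}%")
print(f"Best model: {best_model} ({best_accuracy:.2f}%)")
```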

In [37]:
def sentiment(text):
    # Transform the input text to the same feature space as the trained model
    text_tfidf = tfidf_vectorizer.transform([text])
    # Predict the sentiment
    predict = model_rf.predict(text_tfidf)
    return predict[0]  # Return the predicted sentiment

while True:
    Yes_No = input('Do you want to continue? Please type Y for yes and N for No: ').strip().upper()
    if Yes_No == 'Y':
        text = input('Please enter your text: ')
        predicted_sentiment = sentiment(text)
        print(f'The predicted sentiment is: {predicted_sentiment}')
    elif Yes_No == 'N':
        print("Exiting...")
        break
    else:
        print("Invalid input. Please enter Y for yes or N for no.")
Do you want to continue? Please type Y for yes and N for No: Y
Please enter your text: stars are bright
The predicted sentiment is: Positive
Do you want to continue? Please type Y for yes and N for No: Y
Please enter your text: I am Excited for the trip
The predicted sentiment is: Positive
Do you want to continue? Please type Y for yes and N for No: Arousal of excitement before the results
Invalid input. Please enter Y for yes or N for no.
Do you want to continue? Please type Y for yes and N for No: Y
Please enter your text: Arousal of excitement before the result
The predicted sentiment is: Arousal
Do you want to continue? Please type Y for yes and N for No: Y
Please enter your text: The leaver of the flower are quite tender.
The predicted sentiment is: Positive
Do you want to continue? Please type Y for yes and N for No: Y
Please enter your text: Tenderness is in everything
The predicted sentiment is: Tenderness
Do you want to continue? Please type Y for yes and N for No: One shouldn't be jealous of anyone's progress
Invalid input. Please enter Y for yes or N for no.
Do you want to continue? Please type Y for yes and N for No: Y
Please enter your text: One shouldn't be jealous of anyone's progress
The predicted sentiment is: Positive
Do you want to continue? Please type Y for yes and N for No: y
Please enter your text: Anxiety is in my heart
The predicted sentiment is: Anxiety
Do you want to continue? Please type Y for yes and N for No: Y
Please enter your text: I feel Nostalgia when seeing old pictures
The predicted sentiment is: Nostalgia
Do you want to continue? Please type Y for yes and N for No: N
Exiting...
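The interactive loop is convenient for spot checks, but a non-interactive helper makes the same checks repeatable. The sketch below assumes the predictor is any function that maps a string to a label. `predict_batch` and `keyword_stub` are hypothetical names, and the stub only stands in for the trained model so that the example runs on its own.

```python
def predict_batch(texts, predict_fn):
    """Return (text, predicted_label) pairs, skipping blank inputs."""
    results = []
    for text in texts:
        cleaned = text.strip()
        if not cleaned:
            continue  # ignore empty or whitespace-only entries
        results.append((cleaned, predict_fn(cleaned)))
    return results

def keyword_stub(text):
    """Hypothetical stand-in for the trained model; not a real classifier."""
    return "Anxiety" if "anxiety" in text.lower() else "Positive"

samples = ["stars are bright", "   ", "Anxiety is in my heart"]
results = predict_batch(samples, keyword_stub)
for text, label in results:
    print(f"{text!r} -> {label}")
```

In the notebook, the `sentiment` function defined above could be passed in place of `keyword_stub`. In the session shown, several inputs that contain a label word (Arousal, Tenderness, Anxiety, Nostalgia) received that label, so a batch that includes texts without such keywords may give a fairer picture of the model's behaviour.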